Abstract

In this work, we obtain performance guarantees for modified-CS and for its improved version, modified-CS-Add-LS-Del, for
recursive reconstruction of a time sequence of sparse signals from a reduced set of noisy measurements available at each time.
Under mild assumptions, we show that the support recovery error of both algorithms is bounded by a time-invariant and small
value at all times. The same is also true for the reconstruction error. Under a slow support change assumption, (i) the support
recovery error bound is small compared to the support size; and (ii) our results hold under weaker assumptions on the number
of measurements than what simple compressive sensing (CS) / basis pursuit denoising needs. We do the above for two types of
signal change models. The first is a simple model that may often not be realistic. However, it is used to illustrate the key ideas
and it allows for easy comparison of the various results. The second one is a more complicated but realistic signal change model
and includes the first model as a special case.

I. Introduction

The static sparse reconstruction problem has been studied for a while [2], [3], [4]. The papers on compressive sensing (CS)
from 2005 [5], [6], [7], [8], [9], [10] (and many other more recent works) provide the missing theoretical guarantees: conditions
for exact recovery and error bounds when exact recovery is not possible. In more recent works, the problem of recursively
recovering a time sequence of sparse signals, with slowly changing sparsity patterns, has also been studied [11], [12], [13],
[14], [15], [16], [17], [18]. By "recursive" reconstruction, we mean that we want to use only the current measurement vector
and the previous reconstructed signal to recover the current signal. This problem occurs in many applications such as real-time
dynamic magnetic resonance imaging (MRI); single-pixel camera based real-time video imaging; recursively separating the
region of the brain that is activated in response to a stimulus from brain functional MRI (fMRI) sequences [19]; and recursively
extracting sparse foregrounds (e.g. moving objects) from slow-changing (low-dimensional) backgrounds in video sequences [20].
For other potential applications, see [21], [22].
An important assumption introduced and empirically verified in [11], [12] is that for many natural signal/image sequences,
the sparsity pattern (support set of its projection into the sparsity basis) changes slowly over time. In [13], the authors exploited
this fact to reformulate the above problem as one of sparse recovery with partially known support and introduced a solution
approach called modified-CS. Given the partial support knowledge T, modified-CS tries to find a signal that is sparsest outside
of T among all signals that satisfy the data constraint. Exact recovery conditions were obtained for modified-CS and it was
argued that these are weaker than those for simple CS (basis pursuit) under the slow support change assumption. Related ideas
for support recovery with prior knowledge about the support entries, that appeared in parallel, include [23], [24]. All of [13],
[23] and [24] studied the noise-free measurements case. Later work includes [25], [26].

Error bounds for modified-CS for noisy measurements were obtained in [27], [28], [1]. When modified-CS is used for
recursive reconstruction, these bounds tell us that the reconstruction error bound at the current time is proportional to the 
support recovery error (misses and extras in the support estimate) from the previous time. Unless we impose extra conditions, 
this support error can keep increasing over time, in which case the bound is not useful. Thus, for recursive reconstruction, the 
important question is, under what conditions can we obtain time-invariant bounds on the support error (which will, in turn, 
imply time-invariant bounds on the reconstruction error)? In other words, when can we ensure "stability" over time? Notice 
that, even if we did nothing, i.e. we set $\hat{x}_t = 0$, the support error will be bounded by the support size. If the support size is
bounded, then this is a naive stability result too, but is not useful. Here, we look for results in which the support error bound 
is small compared to the support size. 



This work was supported by NSF grants ECCS-0725849 and CCF-0917015. A part of this work was presented at Allerton 2010 [1] and another part will
be presented at ISIT 2013. The results in this paper are a significant generalization of both conference papers and include all the proofs (which are missing
in the ISIT paper due to lack of space).

Fig. 1: Slow support change in medical image sequences. The two-level Daubechies-4 2D discrete wavelet transform (DWT) served as the
sparsity basis. Since real image sequences are only approximately sparse, we use $\mathcal{N}_t$ to denote the 99%-energy support of the DWT of these
sequences. The support size, $|\mathcal{N}_t|$, was 6-7% of the image size for both sequences. We plot the number of additions (left) and the number
of removals (right) as a fraction of $|\mathcal{N}_t|$. Notice that all changes are less than 2% of the support size.



Stability over time has not been studied much for recursive recovery of sparse signal sequences. To the best of our knowledge,
it has only been addressed in [29], [11], and in very recent work [18]. The result of [18] is for exact dynamic support
recovery in the noise-free case and it studies a different problem: the multiple measurement vector (MMV) version of the
recursive recovery problem. The result from [29] for Kalman filtered compressed sensing (KF-CS) stability holds under strong
assumptions, e.g. its assumptions on the measurement matrix are as strong as those needed by simple CS. The result from [11]
for Least Squares CS-residual (LS-CS) stability holds under mostly mild assumptions. However, its key limitation is that it
assumes that support changes occur every p frames. But from testing the slow support change assumption for real data (medical
image sequences), it has been observed that support changes usually occur at every time, e.g. see Fig. 1. This important case
is the focus of the current work. We explain the differences of our results w.r.t. the LS-CS result in detail later in Sec. IV-E.



A. Contributions 

In this work, we introduce modified-CS-add-LS-del which is a modified-CS based algorithm for recursive recovery with an 
improved support estimation step and we explain how to set its parameters in practice. The main contribution of this work is to 
obtain conditions for stability of modified-CS and modified-CS-add-LS-del for recursive recovery of a time sequence of sparse 
signals. We compare the two results with each other and with the result for simple CS. Here and in the rest of the paper, simple 
CS refers to the solution of (4) (this is often referred to as basis pursuit denoising). Under mild assumptions, and for two types
of signal change models, we show that the support recovery error, and reconstruction error, of both algorithms is bounded 
by a time-invariant value at all times. The support error bound is proportional to the maximum allowed support change size. 
Under slow support change, this bound is small compared to the support size, making our result meaningful. Similar arguments 
can be made for the reconstruction error also. The assumptions we need are: mild restricted isometry conditions [8] on the
measurement matrix; for any new element that is added to the support, either its initial magnitude is large enough or for the 
first few time instants, its magnitude increases at a large enough rate; and a similar assumption for magnitude decrease and 
removal from the support; appropriately set algorithm parameters; and a special start condition. 

Under the slow support change assumption, we can also argue that our results hold under weaker restricted isometry 
assumptions than what simple CS needs. Also, the result for modified-CS-add-LS-del holds under weaker assumptions on the 
initial nonzero magnitude or the magnitude increase rate than that for modified-CS. 

We do the above for two signal change models. The first is deliberately chosen to be a simple model that may often not 
be realistic. However, it helps to illustrate the key ideas of our results, and it allows for easy comparison of the results. The 
second model is a realistic, but more complicated, signal change model that allows different initial magnitudes; different rates 
of magnitude increase and of magnitude decrease at different times and for different elements; it also allows different numbers 
of support additions and removals at various times. We use MRI image sequences to demonstrate that this model is indeed 
valid for real data. The second model includes the first as a special case (see Remark 4).



B. Notation and Problem Definition 

We let $[1, m] := [1, 2, \dots, m]$. We use $\emptyset$ to denote the empty set. We use $T^c$ to denote the complement of a set $T$ w.r.t. $[1, m]$,
i.e. $T^c := \{i \in [1, m] : i \notin T\}$. We use $|T|$ to denote the cardinality of $T$. The set operations
$\cup$, $\cap$, $\setminus$ have their usual meanings (recall that $A \setminus B := A \cap B^c$). If two sets $B$, $C$ are disjoint, we just write $D \cup B \setminus C$ instead
of writing $(D \cup B) \setminus C$.

For a vector, $v$, and a set, $T$, $v_T$ denotes the $|T|$ length sub-vector containing the elements of $v$ corresponding to the
indices in the set $T$. $\|v\|_k$ denotes the $\ell_k$ norm of a vector $v$. If just $\|v\|$ is used, it refers to $\|v\|_2$. Similarly, for a matrix
$M$, $\|M\|_k$ denotes its induced $k$-norm, while just $\|M\|$ refers to $\|M\|_2$. $M'$ denotes the transpose of $M$ and $M^\dagger$ denotes the
Moore-Penrose pseudo-inverse of $M$ (when $M$ is full column rank, $M^\dagger := (M'M)^{-1}M'$). Also, $M_T$ denotes the sub-matrix
obtained by extracting the columns of $M$ corresponding to indices in $T$.

We refer to the left (right) hand side of an equation or inequality as LHS (RHS). 

We assume the following observation model: 

$$y_t = A_t x_t + w_t, \quad \|w_t\| \le \epsilon \tag{1}$$

where $x_t$ is an $m$ length sparse vector with support set $\mathcal{N}_t$, i.e. $\mathcal{N}_t := \{i : (x_t)_i \neq 0\}$; $A_t$ is an $n_t \times m$ measurement matrix;
$y_t$ is the $n_t$ length observation vector at time $t$ (with $n_t < m$); and $w_t$ is the observation noise. For $t > 0$, we fix $n_t = n$.

Our goal is to recursively estimate $x_t$ using $y_0, \dots, y_t$. By recursively, we mean, use only $y_t$ and the estimate from $t - 1$,
$\hat{x}_{t-1}$, to compute the estimate at $t$.
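
As a small illustration of the observation model (1), the following sketch generates one measurement vector with norm-bounded noise. It is a toy under stated assumptions, not part of the paper: the helper names `bounded_noise` and `observe` are hypothetical, matrices are plain lists of rows, and the noise is a Gaussian draw that is rescaled only when its Euclidean norm exceeds `eps`.

```python
import math
import random
from typing import List


def bounded_noise(n: int, eps: float, rng: random.Random) -> List[float]:
    """Draw an n-length noise vector w with ||w||_2 <= eps.

    Assumption for illustration: Gaussian entries, shrunk onto the
    eps-ball only when the draw falls outside it.
    """
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm > eps:
        # The tiny extra shrink guards against floating-point round-off.
        scale = eps / norm * (1.0 - 1e-12)
        w = [v * scale for v in w]
    return w


def observe(A: List[List[float]], x: List[float], w: List[float]) -> List[float]:
    """Return y = A x + w for an n x m matrix A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) + wi for row, wi in zip(A, w)]
```

With `A` of size n x m and n < m, `observe(A, x, bounded_noise(n, eps, rng))` plays the role of $y_t$ in (1).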

Remark 1 (Why bounded noise): All results for bounding CS (CS solved using $\ell_1$ minimization) error in noise, and hence
all results for bounding modified-CS error in noise, either assume a deterministic noise bound and then bound $\|x - \hat{x}\|$, e.g.,
[10], [30], [27], [31], [32]; or assume unbounded, e.g. Gaussian, noise and then bound $\|x - \hat{x}\|$ with "large" probability, e.g.
[33], [34, Sec IV], [12, Section III-A], [32]. The latter approach is not useful for recovering a time sequence of sparse signals
because the error bound will hold for all times $0 \le t < \infty$ with probability zero.

To get a meaningful error stability result with unbounded, e.g. Gaussian, noise, one needs a way to compute or bound the 
expected value of the error at each time, e.g. compute $E[(x_t - \hat{x}_t)(x_t - \hat{x}_t)']$ or bound some norm of it. This is possible to
do, for example, for a Kalman filter applied to a linear system model with additive Gaussian noise; and hence in that case, 
one can assume Gaussian noise and still get a time-invariant bound on the expected value of the error. However, for CS or 
modified-CS, there is no easy way to compute or bound the expected value of the error. Moreover, even if one could do this 
for a given time, it would not tell us anything about the support recovery error (for the given noise sequence realization) and 
hence would not be useful for analyzing modified-CS. 

As a sidenote, we should point out that, in most applications, the noise is typically bounded (because of finite sensing power 
available). One often chooses to model the noise as Gaussian because it simplifies performance analysis in many situations. 

C. More Notation 

For any matrix, $A$, the $S$-restricted isometry constant (RIC) [8], $\delta_S(A)$, is the smallest real number satisfying

$$(1 - \delta_S)\|c\|^2 \le \|A_T c\|^2 \le (1 + \delta_S)\|c\|^2 \tag{2}$$

for all sets $T \subseteq [1, m]$ of cardinality $|T| \le S$ and all real vectors $c$ of length $|T|$; the restricted orthogonality constant (ROC)
[8], $\theta_{S_1,S_2}(A)$, is the smallest real number satisfying

$$|c_1' A_{T_1}' A_{T_2} c_2| \le \theta_{S_1,S_2} \|c_1\| \|c_2\| \tag{3}$$

for all disjoint sets $T_1, T_2 \subseteq [1, m]$ with $|T_1| \le S_1$, $|T_2| \le S_2$ and $S_1 + S_2 \le m$, and for all vectors $c_1$, $c_2$ of length $|T_1|$, $|T_2|$
respectively.
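
For the smallest orders, definitions (2) and (3) can be evaluated directly, which may help fix ideas. The sketch below is illustrative only and uses hypothetical helper names: for $S = 1$, (2) reduces to a condition on individual column norms, so $\delta_1 = \max_i |\|a_i\|^2 - 1|$, and for $S_1 = S_2 = 1$, (3) reduces to $\theta_{1,1} = \max_{i \neq j} |a_i' a_j|$. Higher orders require a search over all column subsets and are not attempted here.

```python
from typing import List


def column(A: List[List[float]], j: int) -> List[float]:
    """Extract column j of a matrix stored as a list of rows."""
    return [row[j] for row in A]


def ric_order_1(A: List[List[float]]) -> float:
    """delta_1(A): largest deviation of a squared column norm from 1."""
    m = len(A[0])
    return max(abs(sum(v * v for v in column(A, j)) - 1.0) for j in range(m))


def roc_order_1_1(A: List[List[float]]) -> float:
    """theta_{1,1}(A): largest absolute inner product between two distinct columns."""
    m = len(A[0])
    cols = [column(A, j) for j in range(m)]
    return max(
        abs(sum(a * b for a, b in zip(cols[i], cols[j])))
        for i in range(m)
        for j in range(i + 1, m)
    )
```

For a matrix with unit-norm columns, `ric_order_1` returns (numerically) zero, while `roc_order_1_1` measures the worst pairwise coherence.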

In this work, we need the same condition on the RIC and ROC of all measurement matrices A t for t > 0. Thus, in the rest 
of this paper, we let 

$$\delta_S := \max_{t > 0} \delta_S(A_t), \quad \text{and} \quad \theta_{S_1,S_2} := \max_{t > 0} \theta_{S_1,S_2}(A_t).$$

If we need the RIC or ROC of any other matrix, then we specify it explicitly.

We use $\alpha$ to denote the support estimation threshold used by modified-CS and we use $\alpha_{add}$, $\alpha_{del}$ to denote the support
addition and deletion thresholds used by modified-CS-add-LS-del. In either case, we use $\hat{\mathcal{N}}_t$ to denote the support estimate at
time $t$.

Definition 1 ($T_t$, $\Delta_t$, $\Delta_{e,t}$): We use $T_t := \hat{\mathcal{N}}_{t-1}$ to denote the support estimate from the previous time. This serves as the
predicted support at time $t$. We use $\Delta_t := \mathcal{N}_t \setminus T_t$ to denote the unknown part of $\mathcal{N}_t$ and $\Delta_{e,t} := T_t \setminus \mathcal{N}_t$ to denote the
"erroneous" part of $T_t$.

With the above definition, clearly, $\mathcal{N}_t = T_t \cup \Delta_t \setminus \Delta_{e,t}$.

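Definition 2 ($\tilde{T}_t$, $\tilde{\Delta}_t$, $\tilde{\Delta}_{e,t}$): We use $\tilde{T}_t := \hat{\mathcal{N}}_t$ to denote the final estimate of the current support; $\tilde{\Delta}_t := \mathcal{N}_t \setminus \tilde{T}_t$ to denote
the "misses" in $\hat{\mathcal{N}}_t$ and $\tilde{\Delta}_{e,t} := \tilde{T}_t \setminus \mathcal{N}_t$ to denote the "extras".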
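
The "misses" and "extras" of a support estimate are plain set differences, so they can be sketched with Python sets. The function name `support_errors` is hypothetical and the index values are made up for illustration.

```python
from typing import Set, Tuple


def support_errors(true_support: Set[int], estimate: Set[int]) -> Tuple[Set[int], Set[int]]:
    """Return (misses, extras) of a support estimate.

    misses: indices in the true support that the estimate lacks.
    extras: indices in the estimate that are not in the true support.
    """
    misses = true_support - estimate
    extras = estimate - true_support
    return misses, extras


# Toy example: the true support is recovered from the estimate by
# adding the misses and removing the extras.
N_t = {1, 2, 5, 7}
T_t = {1, 2, 3, 5}
misses, extras = support_errors(N_t, T_t)
recovered = (T_t | misses) - extras
```

The same helper applies to the predicted support (Definition 1) and to the final estimate (Definition 2); only the set passed as `estimate` changes.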

The sets $T_{add}$, $\Delta_{add}$, $\Delta_{e,add}$ are defined in Definition 3 which is given in the next section.

D. Other Related Work

"Recursive sparse reconstruction" also sometimes refers to homotopy methods, e.g. [35], whose goal is to use the
past reconstructions and homotopy to speed up the current optimization, but not to achieve accurate recovery from fewer 
measurements (than what simple CS needs). The goals in the above works are quite different from ours. 

Iterative support estimation approaches (using the recovered support from the first iteration for a second weighted $\ell_1$ step
and doing this iteratively) have been studied in recent work [36], [37], [38], [39]. This is done for iteratively improving the
recovery of a single signal. 

E. Paper Organization 

The rest of the paper is organized as follows. The algorithms, modified-CS and modified-CS-add-LS-del, are introduced
in Sec. II. In Sec. III, we give a very simple signal change model and derive stability results for modified-CS and
modified-CS-add-LS-del under this set of model assumptions. In Sec. IV, we do the same for a realistic signal change model. The results are
discussed in Sec. III-D and IV-D respectively. In Sec. V, we demonstrate that the signal model assumptions of Sec. IV are indeed
valid for medical imaging data. In Sec. VI, we explain how to set the algorithm parameters automatically for both modified-CS
and modified-CS-add-LS-del. In this section, we also give simulation experiments that back up some of our discussions from
earlier sections. Conclusions and future work are given in Sec. VII.

II. MODIFIED-CS AND MODIFIED-CS-ADD-LS-DEL FOR RECURSIVE RECONSTRUCTION 

A. Modified-CS 

Modified-CS was first proposed in [13] as a solution to the problem of sparse reconstruction with partial, and possibly
erroneous, knowledge of the support. Denote this "known" support by $T$. Modified-CS tries to find a signal that is sparsest
outside of the set $T$ among all signals satisfying the data constraint. In the noisy case, it solves $\min_\beta \|(\beta)_{T^c}\|_1$ s.t. $\|y_t - A_t\beta\| \le \epsilon$.
For recursively reconstructing a time sequence of sparse signals, we use the support estimate from the previous time, $\hat{\mathcal{N}}_{t-1}$, as
the set $T$. The simplest way to estimate the support is by thresholding the output of modified-CS. We summarize the complete
algorithm in Algorithm 1.

At the initial time, $t = 0$, we let $T$ be the empty set, $\emptyset$, i.e. we do simple CS. Alternatively, as explained in [13], we can
use prior knowledge of the initial signal's support as the set T at t = 0, e.g. for wavelet sparse images with no (or a small) 
black background, the set of indices of the approximation coefficients can form the set T. This prior knowledge is usually not 
as accurate. 

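Algorithm 1 Modified-CS

For $t \ge 0$, do

1) Simple CS. If $t = 0$, set $T_t = \emptyset$ and compute $\hat{x}_{t,modcs}$ as the solution of

$$\min_\beta \|\beta\|_1 \ \text{s.t.} \ \|y_t - A_t\beta\| \le \epsilon \tag{4}$$

2) Modified-CS. If $t > 0$, set $T_t = \hat{\mathcal{N}}_{t-1}$ and compute $\hat{x}_{t,modcs}$ as the solution of

$$\min_\beta \|(\beta)_{T_t^c}\|_1 \ \text{s.t.} \ \|y_t - A_t\beta\| \le \epsilon \tag{5}$$

3) Estimate the Support. Compute $\tilde{T}_t$ as

$$\tilde{T}_t = \{i \in [1, m] : |(\hat{x}_{t,modcs})_i| > \alpha\} \tag{6}$$

4) Set $\hat{\mathcal{N}}_t = \tilde{T}_t$. Output $\hat{x}_{t,modcs}$. Feedback $\hat{\mathcal{N}}_t$.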
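
The recursion in Algorithm 1 can be sketched as a short loop. This is only a skeleton under stated assumptions: the convex program in steps 1 and 2 is delegated to a caller-supplied function `solve_modcs(t, y, T)`, which is a hypothetical interface rather than an existing library call, and indices are 0-based instead of ranging over $[1, m]$.

```python
from typing import Callable, List, Sequence, Set, Tuple

# Hypothetical solver interface: given the time index, the measurements and
# the predicted support T, return the modified-CS output (simple CS if T is empty).
Solver = Callable[[int, Sequence[float], Set[int]], List[float]]


def estimate_support(x_hat: Sequence[float], alpha: float) -> Set[int]:
    """Step 3: keep the indices whose magnitude exceeds the threshold alpha."""
    return {i for i, v in enumerate(x_hat) if abs(v) > alpha}


def modified_cs_recursion(
    ys: Sequence[Sequence[float]], solve_modcs: Solver, alpha: float
) -> List[Tuple[List[float], Set[int]]]:
    """Run steps 1-4 over a measurement sequence and return (x_hat, support) per time."""
    support_est: Set[int] = set()  # empty predicted support at t = 0, i.e. simple CS
    outputs = []
    for t, y in enumerate(ys):
        x_hat = solve_modcs(t, y, support_est)           # steps 1 and 2
        support_est = estimate_support(x_hat, alpha)     # step 3
        outputs.append((x_hat, support_est))             # step 4: output and feed back
    return outputs
```

Only the support estimate is carried from one time instant to the next, which is what makes the scheme recursive.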



B. Limitation: Biased solution 

Modified-CS uses single step thresholding for estimating the support $\mathcal{N}_t$. The threshold, $\alpha$, needs to be large enough to
ensure that all (or most) removed elements are correctly deleted and there are no (or very few) false detections. But this means
that the new additions to the support set will either have to be added at a large value, or their magnitude will need to increase
to a large value quickly enough to ensure correct detection within a small delay. This issue is further exacerbated by the fact
that $\hat{x}_{t,modcs}$ is a biased estimate of $x_t$. Along $T_t^c$, the values of $\hat{x}_{t,modcs}$ will be biased towards zero (because we minimize
$\|(\beta)_{T_t^c}\|_1$), while, along $T_t$, they may be biased away from zero. This will create the following problem. The set $T_t$ contains
the set $\Delta_e$ which needs to be deleted. Since the estimates along $\Delta_e$ may be biased away from zero, one will need a higher
threshold to delete them. But that would make detection more difficult, especially since the estimates along $\Delta \subseteq T_t^c$ will be
biased towards zero. A similar issue for noisy CS, and a possible solution (Gauss-Dantzig selector), was first discussed in [33].



C. Modified-CS with Add-LS-Del 

The bias issue can be partly addressed by replacing the support estimation step of Modified-CS by a three step Add-LS-Del
procedure summarized in Algorithm 2. It involves a support addition step (that uses a smaller threshold, $\alpha_{add}$), as in (7),
followed by LS estimation on the new support estimate, $T_{add}$, as in (8), and then a deletion step that thresholds the LS estimate,
as in (9). This can be followed by a second LS estimation using the final support estimate, as in (10), although this last step is
not critical. The addition step threshold, $\alpha_{add}$, needs to be just large enough to ensure that the matrix used for LS estimation,
$A_{T_{add}}$, is well-conditioned. If $\alpha_{add}$ is chosen properly and if $n$ is large enough, the LS estimate on $T_{add}$ will have smaller error
and will be less biased than the modified-CS output. As a result, deletion will be more accurate when done using this estimate.
This also means that one can use a larger deletion threshold, $\alpha_{del}$, which will ensure quicker deletion of extras.

Related ideas were introduced in our older work [12], [11] for KF-CS and LS-CS, and in [40], [30] for a greedy algorithm
for static sparse reconstruction.

We explain how to automatically set the parameters for both modified-CS-add-LS-del and modified-CS in Sec. VI-A.



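Algorithm 2 Modified-CS-Add-LS-Del

For $t \ge 0$, do

1) Simple CS. If $t = 0$, set $T_t = \emptyset$ and compute $\hat{x}_{t,modcs}$ as the solution of (4).

2) Modified-CS. If $t > 0$, set $T_t = \hat{\mathcal{N}}_{t-1}$ and compute $\hat{x}_{t,modcs}$ as the solution of (5).

3) Additions / LS. Compute $T_{add,t}$ and the LS estimate using it:

$$\hat{\mathcal{A}}_t = \{i : |(\hat{x}_{t,modcs})_i| > \alpha_{add}\}$$

$$T_{add,t} = T_t \cup \hat{\mathcal{A}}_t \tag{7}$$

$$(\hat{x}_{t,add})_{T_{add,t}} = A_{T_{add,t}}^\dagger y_t, \quad (\hat{x}_{t,add})_{T_{add,t}^c} = 0 \tag{8}$$

4) Deletions / LS. Compute $\tilde{T}_t$ and the LS estimate using it:

$$\hat{\mathcal{R}}_t = \{i \in T_{add,t} : |(\hat{x}_{t,add})_i| \le \alpha_{del}\}$$

$$\tilde{T}_t = T_{add,t} \setminus \hat{\mathcal{R}}_t \tag{9}$$

$$(\hat{x}_t)_{\tilde{T}_t} = A_{\tilde{T}_t}^\dagger y_t, \quad (\hat{x}_t)_{\tilde{T}_t^c} = 0 \tag{10}$$

5) Set $\hat{\mathcal{N}}_t = \tilde{T}_t$. Feedback $\hat{\mathcal{N}}_t$. Output $\hat{x}_t$.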
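
Steps 3 and 4 of Algorithm 2 can be sketched in plain Python as follows. This is an illustrative reading of the Add-LS-Del procedure, not the authors' implementation: matrices are lists of rows, indices are 0-based, the helper names are hypothetical, and the LS estimate is computed from the normal equations with a small Gaussian-elimination routine, which assumes the selected columns are linearly independent.

```python
from typing import List, Sequence, Set, Tuple


def solve_linear(G: List[List[float]], b: List[float]) -> List[float]:
    """Solve G z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(G)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = s / M[r][r]
    return z


def ls_on_support(A: List[List[float]], y: Sequence[float], T: Set[int]) -> List[float]:
    """LS estimate supported on T: (A_T' A_T)^{-1} A_T' y on T, zero elsewhere."""
    idx = sorted(T)
    x = [0.0] * len(A[0])
    if not idx:
        return x
    rows = range(len(A))
    G = [[sum(A[r][i] * A[r][j] for r in rows) for j in idx] for i in idx]
    b = [sum(A[r][i] * y[r] for r in rows) for i in idx]
    for i, v in zip(idx, solve_linear(G, b)):
        x[i] = v
    return x


def add_ls_del(
    A: List[List[float]], y: Sequence[float], x_modcs: Sequence[float],
    T: Set[int], alpha_add: float, alpha_del: float,
) -> Tuple[Set[int], List[float]]:
    """Addition, LS, deletion, and final LS, following steps 3 and 4."""
    added = {i for i, v in enumerate(x_modcs) if abs(v) > alpha_add}
    T_add = set(T) | added                                   # addition step
    x_add = ls_on_support(A, y, T_add)                       # LS on T_add
    removed = {i for i in T_add if abs(x_add[i]) <= alpha_del}
    T_final = T_add - removed                                # deletion step
    return T_final, ls_on_support(A, y, T_final)             # final LS
```

In the toy test used with this sketch, the predicted support contains one extra index and misses one true index; the addition step picks up the miss, and the LS estimate drives the extra to zero so that it is deleted.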



Definition 3 ($T_{add,t}$, $\Delta_{add,t}$, $\Delta_{e,add,t}$): The set $T_{add,t}$ is the support estimate obtained after the support addition step.
It is defined in (7) in Algorithm 2. The set $\Delta_{add,t} := \mathcal{N}_t \setminus T_{add,t}$ denotes the set of missing elements from $\mathcal{N}_t$ and the set
$\Delta_{e,add,t} := T_{add,t} \setminus \mathcal{N}_t$ denotes the set of extras in it. We remove the subscript $t$ where not needed.



D. Modified-CS error bound at time t 

By adapting the approach of [10], the error of modified-CS can be bounded as a function of $|T| = |\mathcal{N}| + |\Delta_e| - |\Delta|$ and
$|\Delta|$. This was done in [41]. We state a modified version here.

Lemma 1 (modified-CS error bound): Assume that $y_t$ satisfies (1) and the support of $x_t$ is $\mathcal{N}_t$. Consider step 2 of Algorithm
1 or 2. If $\delta_{|T_t|+3|\Delta_t|} = \delta_{|\mathcal{N}_t|+|\Delta_{e,t}|+2|\Delta_t|} \le (\sqrt{2} - 1)/2$, then

$$\|x_t - \hat{x}_{t,modcs}\| \le C_1(|T_t| + 3|\Delta_t|)\epsilon \le 7.50\epsilon, \quad C_1(S) := \frac{4\sqrt{1 + \delta_S}}{1 - 2\delta_S}.$$

For completeness, we provide a proof in Appendix A.

Notice that the bound by $C_1(|T| + 3|\Delta|)\epsilon$ will hold as long as $\delta_{|T|+3|\Delta|} < 1/2$. By enforcing that $\delta_{|T|+3|\Delta|} \le c/2$ for a
$c < 1$, we ensure that $C_1(.)$ is bounded by a fixed constant. To state the above lemma we pick $c = \sqrt{2} - 1$ and this gives
$C_1(.) \le 7.50$. We can state a similar result for CS [10].

Lemma 2 (CS error bound [10]): Let $x$ be a sparse vector with support $\mathcal{N}$ and let $y := Ax + w$ with $\|w\| \le \epsilon$. Let $\hat{x}_{cs}$
denote the solution of (5) with $T = \emptyset$. If $\delta_{2|\mathcal{N}|} \le (\sqrt{2} - 1)/2$ then

$$\|x - \hat{x}_{cs}\| \le C_1(2|\mathcal{N}|)\epsilon \le 8.57\epsilon$$
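
The constants 0.207 and 7.50 quoted for modified-CS can be checked numerically. The sketch below assumes that the error constant has the form $C_1 = 4\sqrt{1 + \delta}/(1 - 2\delta)$; this form is a reconstruction that is consistent with the quoted numbers, and the function name `c1` is hypothetical.

```python
import math


def c1(delta: float) -> float:
    """Assumed error constant 4*sqrt(1 + delta) / (1 - 2*delta), valid for 0 <= delta < 1/2."""
    if not 0.0 <= delta < 0.5:
        raise ValueError("the bound requires 0 <= delta < 1/2")
    return 4.0 * math.sqrt(1.0 + delta) / (1.0 - 2.0 * delta)


# RIC threshold used in Lemma 1: (sqrt(2) - 1) / 2, roughly 0.207.
ric_threshold = (math.sqrt(2.0) - 1.0) / 2.0
# c1 is increasing in delta, so its value at the threshold is the worst case.
worst_case_constant = c1(ric_threshold)
```

Under this assumed form, any RIC below the threshold gives a smaller constant, and the constant blows up as the RIC approaches 1/2.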



E. LS step error bound at time t 

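We can claim the following about the LS step error in step 3 of Algorithm 2.

Lemma 3: Assume that $y_t$ satisfies (1) and the support of $x_t$ is $\mathcal{N}_t$. Consider step 3 of Algorithm 2.

1) $(x_t - \hat{x}_{t,add})_{T_{add,t}} = -(A_{T_{add,t}}' A_{T_{add,t}})^{-1}[A_{T_{add,t}}' w_t + A_{T_{add,t}}' A_{\Delta_{add,t}} (x_t)_{\Delta_{add,t}}]$, $(x_t - \hat{x}_{t,add})_{\Delta_{add,t}} = (x_t)_{\Delta_{add,t}}$, and
$(x_t - \hat{x}_{t,add})_i = 0$, if $i \notin T_{add,t} \cup \Delta_{add,t}$.

2) a) $\|(x_t - \hat{x}_{t,add})_{T_{add,t}}\| \le \frac{1}{\sqrt{1 - \delta_{|T_{add,t}|}}}\epsilon + \frac{\theta_{|T_{add,t}|,|\Delta_{add,t}|}}{1 - \delta_{|T_{add,t}|}} \|(x_t)_{\Delta_{add,t}}\|$.

b) $\|x_t - \hat{x}_{t,add}\| \le \frac{1}{\sqrt{1 - \delta_{|T_{add,t}|}}}\epsilon + \left(1 + \frac{\theta_{|T_{add,t}|,|\Delta_{add,t}|}}{1 - \delta_{|T_{add,t}|}}\right) \|(x_t)_{\Delta_{add,t}}\|$.

Proof: The first claim follows directly from the expression for $\hat{x}_{t,add}$. The second claim uses the first claim and the facts
that $\|A_T^\dagger\|_2 \le 1/\sqrt{1 - \delta_{|T|}}$, $\|(A_T' A_T)^{-1}\|_2 \le 1/(1 - \delta_{|T|})$ and $\|A_T' A_\Delta\|_2 \le \theta_{|T|,|\Delta|}$ [?].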
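
The LS step bound can be evaluated numerically to see how the constants behave. The sketch below assumes the bound has the form $\epsilon/\sqrt{1 - \delta} + (1 + \theta/(1 - \delta))\|(x_t)_{\Delta_{add}}\|$, with the on-support part omitting the leading 1; this is a reconstruction from the stated norm facts, and the function name is hypothetical.

```python
import math
from typing import Tuple


def ls_step_bound(delta: float, theta: float, eps: float, missed_norm: float) -> Tuple[float, float]:
    """Return (bound on the error along T_add, bound on the total error).

    delta: RIC of order |T_add|, theta: ROC of order (|T_add|, |Delta_add|),
    eps: noise bound, missed_norm: norm of the signal on the missed set.
    """
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    noise_term = eps / math.sqrt(1.0 - delta)
    leak = theta / (1.0 - delta)
    on_support = noise_term + leak * missed_norm
    total = noise_term + (1.0 + leak) * missed_norm
    return on_support, total
```

With `delta = theta = 0.207`, the two multipliers evaluate to about 1.12 and 1.261, which are the constants that appear in the final error bound of Theorem 2.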



III. Stability Results: Simple Signal Model 

The modified-CS algorithms do not assume any signal model. However for showing stability, we need certain assumptions 
on the signal change over time. In this section, we assume a very simple signal change model which allows us to illustrate 
the key ideas and allows for easy comparison of the results. 



A. Simple Signal Change Model 

This simple model uses a single parameter, $r$, for the newly added elements' magnitude and for the magnitude increase and
decrease rate of all elements at all times. It also fixes the number of support additions and removals to be $S_a$.
Signal Model 1: Assume the following.

1) (addition and increase) At each $t > 0$, $S_a$ new coefficients get added to the support at magnitude $r$. Denote this set by
$\mathcal{A}_t$. At each $t > 0$, the magnitude of $S_a$ coefficients out of all those which had magnitude $(j - 1)r$ at $t - 1$ increases to
$jr$. This occurs for all $2 \le j \le d$. Thus the maximum magnitude reached by any coefficient is $M := dr$.

2) (decrease and removal) At each $t > 0$, the magnitude of $S_a$ coefficients out of all those which had magnitude $(j + 1)r$
at $t - 1$ decreases to $jr$. This occurs for all $1 \le j \le (d - 1)$. At each $t > 0$, $S_a$ coefficients out of all those which had
magnitude $r$ at $t - 1$ get removed from the support (magnitude becomes zero). Denote this set by $\mathcal{R}_t$.

3) (initial time) At $t = 0$, the support size is $S$ and it contains $2S_a$ elements each with magnitude $r, 2r, \dots, (d - 1)r$, and
$(S - (2d - 2)S_a)$ elements with magnitude $M$.

To understand the implications of the assumptions in Signal Model 1, we define the following sets. For $0 \le j \le d - 1$, let

$$\mathcal{D}_t(j) := \{i : |x_{t,i}| = jr, \ |x_{t-1,i}| = (j + 1)r\}$$

denote the set of elements that decrease from $(j + 1)r$ to $jr$ at time, $t$. For $1 \le j \le d$, let

$$\mathcal{I}_t(j) := \{i : |x_{t,i}| = jr, \ |x_{t-1,i}| = (j - 1)r\}$$

denote the set of elements that increase from $(j - 1)r$ to $jr$ at time, $t$. For $1 \le j \le d - 1$, let

$$\mathcal{S}_t(j) := \{i : 0 < |x_{t,i}| < jr\}$$

denote the set of small but nonzero elements, with smallness threshold $jr$. Clearly, the newly added set, $\mathcal{A}_t = \mathcal{I}_t(1)$ and the
newly removed set, $\mathcal{R}_t = \mathcal{D}_t(0)$. Also, $|\mathcal{I}_t(j)| = S_a$, $|\mathcal{D}_t(j)| = S_a$, $|\mathcal{S}_t(j)| = 2(j - 1)S_a$.
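
The magnitude profile that Signal Model 1 prescribes at the initial time makes the set sizes above easy to verify. The sketch below builds that profile and counts the small elements; the helper names and the example values of $S$, $S_a$, $d$ and $r$ are made up for illustration.

```python
from typing import List


def initial_magnitudes(S: int, S_a: int, d: int, r: float) -> List[float]:
    """Magnitudes at t = 0: 2*S_a elements at each of r, 2r, ..., (d-1)r, the rest at M = d*r."""
    n_max = S - (2 * d - 2) * S_a
    if n_max < 0:
        raise ValueError("S must be at least (2d - 2) * S_a")
    mags: List[float] = []
    for j in range(1, d):
        mags += [j * r] * (2 * S_a)
    mags += [d * r] * n_max
    return mags


def small_set_size(mags: List[float], j: int, r: float) -> int:
    """Number of nonzero elements with magnitude strictly below j*r."""
    return sum(1 for v in mags if 0 < v < j * r)


def signal_power(mags: List[float]) -> float:
    """Squared Euclidean norm of the signal."""
    return sum(v * v for v in mags)
```

For example, with `S = 20`, `S_a = 2`, `d = 3`, `r = 1.0` the small sets have sizes 0, 4 and 8 for `j = 1, 2, 3`, matching $2(j - 1)S_a$.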

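Consider a $1 < j < d$. From Signal Model 1, it is clear that at any time, $t$, $S_a$ elements enter the small elements' set, $\mathcal{S}_t(j)$,
from the bottom (set $\mathcal{A}_t$) and $S_a$ enter from the top (set $\mathcal{D}_t(j - 1)$). Similarly $S_a$ elements leave $\mathcal{S}_t(j)$ from the bottom (set
$\mathcal{R}_t$) and $S_a$ from the top (set $\mathcal{I}_t(j)$). Thus,

$$\mathcal{S}_t(j) = \mathcal{S}_{t-1}(j) \cup (\mathcal{A}_t \cup \mathcal{D}_t(j - 1)) \setminus (\mathcal{R}_t \cup \mathcal{I}_t(j)) \tag{11}$$

Since $\mathcal{A}_t$, $\mathcal{R}_t$, $\mathcal{D}_t(j - 1)$, $\mathcal{I}_t(j)$ are mutually disjoint, $\mathcal{R}_t \subseteq \mathcal{S}_{t-1}(j)$ and $\mathcal{I}_t(j) \subseteq \mathcal{S}_{t-1}(j)$, thus, (11) implies that

$$\mathcal{S}_{t-1}(j) \cup \mathcal{A}_t \setminus \mathcal{R}_t = \mathcal{S}_t(j) \cup \mathcal{I}_t(j) \setminus \mathcal{D}_t(j - 1) \tag{12}$$

Also, clearly, from Signal Model 1,

$$\mathcal{N}_t = \mathcal{N}_{t-1} \cup \mathcal{A}_t \setminus \mathcal{R}_t \tag{13}$$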

B. Stability result for modified-CS

The first step to show stability is to find sufficient conditions for a certain set of large coefficients to definitely get detected,
and for the elements of $\Delta_e$ to definitely get deleted. These are obtained in Lemma 4 by using Lemma 1 and the following
simple facts. Next, we use Lemma 4 to ensure that all new additions to the support get detected within a finite delay, and all
removals from the support get deleted immediately.

Proposition 1 (simple facts): Consider Algorithm 1.

1) An $i \in \mathcal{N}_t$ will definitely get detected in step 3 if $|(x_t)_i| > \alpha + \|x_t - \hat{x}_{t,modcs}\|_\infty$.

2) Similarly, all $i \in \Delta_{e,t}$ (the zero elements of $T_t$) will definitely get deleted in step 3 if $\alpha \ge \|x_t - \hat{x}_{t,modcs}\|_\infty$.

In general, for any vector $z$, $\|z\|_\infty \le \|z\|$ with equality holding only if $z$ is one-sparse (exactly one element of $z$ is nonzero).
If the energy of $z$ is more spread out, $\|z\|_\infty$ will be smaller than $\|z\|$. Typically the error $x_t - \hat{x}_{t,modcs}$ will not be one-sparse,
but will be more spread out. The assumption below states this.

Assumption 1: Consider Algorithm 1. Assume that the Modified-CS reconstruction error is spread out enough so that

$$\|x_t - \hat{x}_{t,modcs}\|_\infty \le \frac{\zeta_M}{\sqrt{S_a}} \|x_t - \hat{x}_{t,modcs}\|$$

for some $\zeta_M \le \sqrt{S_a}$.
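
The "spread out" assumption compares the largest entry of the error to its Euclidean norm. The sketch below computes that ratio for a given nonzero error vector and the smallest constant that would satisfy the assumption for it, reading the assumption as $\|z\|_\infty \le (\zeta/\sqrt{S_a})\|z\|$; the function names are hypothetical.

```python
import math
from typing import Sequence


def spread_ratio(z: Sequence[float]) -> float:
    """Ratio ||z||_inf / ||z||_2 of a nonzero vector; equals 1 exactly when z is one-sparse."""
    return max(abs(v) for v in z) / math.sqrt(sum(v * v for v in z))


def smallest_zeta(z: Sequence[float], S_a: int) -> float:
    """Smallest zeta such that ||z||_inf <= (zeta / sqrt(S_a)) * ||z||_2 holds for this z."""
    return math.sqrt(S_a) * spread_ratio(z)
```

Because the ratio never exceeds 1, the value returned by `smallest_zeta` is always at most $\sqrt{S_a}$, and it gets smaller as the error energy is shared among more entries.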

Combining the above proposition and assumption with Lemma 1, we get the following lemma.

Lemma 4: Consider Algorithm 1. Assume Assumption 1. Assume that $|\mathcal{N}_t| = S_{\mathcal{N}_t}$, $|\Delta_{e,t}| \le S_{\Delta_{e,t}}$ and $|\Delta_t| \le S_{\Delta_t}$.

1) All elements of the set $\{i \in \mathcal{N}_t : |(x_t)_i| \ge b_1\}$ will get detected in step 3 if

• $\delta_{S_{\mathcal{N}_t} + S_{\Delta_{e,t}} + 2S_{\Delta_t}} \le 0.207$, and $b_1 > \alpha + \frac{\zeta_M}{\sqrt{S_a}} 7.50\epsilon$.

2) In step 3, there will be no false additions, and all the true removals from the support (the set $\Delta_{e,t}$) will get deleted at
the current time, if

• $\delta_{S_{\mathcal{N}_t} + S_{\Delta_{e,t}} + 2S_{\Delta_t}} \le 0.207$, and $\alpha \ge \frac{\zeta_M}{\sqrt{S_a}} 7.50\epsilon$.

We use the above lemma to obtain sufficient conditions to ensure the following: for some $d_0 \le d$, at all times, $t$, (i)
only coefficients with magnitude less than $d_0 r$ are part of the final set of misses, $\tilde{\Delta}_t$, and (ii) the final set of extras, $\tilde{\Delta}_{e,t}$,
is an empty set. In other words, we find conditions to ensure that $\tilde{\Delta}_t \subseteq \mathcal{S}_t(d_0)$ and $|\tilde{\Delta}_{e,t}| = 0$. Using Signal Model 1,
$|\mathcal{S}_t(d_0)| = 2(d_0 - 1)S_a$ and thus $\tilde{\Delta}_t \subseteq \mathcal{S}_t(d_0)$ will imply that $|\tilde{\Delta}_t| \le 2(d_0 - 1)S_a$.

Theorem 1 (Stability of modified-CS): Consider Algorithm 1. Assume Signal Model 1 on $x_t$. Also assume that $y_t$ satisfies
(1). Assume that Assumption 1 holds. If, for some $d_0 \le d$, the following hold

1) (support estimation threshold) set $\alpha = \frac{\zeta_M}{\sqrt{S_a}} 7.50\epsilon$

2) (number of measurements) $\delta_{S + (2k_1 + 1)S_a} \le 0.207$,

3) (new element increase rate) $r > G$, where

$$G := \frac{\alpha + \frac{\zeta_M}{\sqrt{S_a}} 7.50\epsilon}{d_0} \tag{14}$$

4) (initial time) at $t = 0$, $n_0$ is large enough to ensure that $\tilde{\Delta}_0 \subseteq \mathcal{S}_0(d_0)$, $|\tilde{\Delta}_0| \le 2(d_0 - 1)S_a$, $|\tilde{\Delta}_{e,0}| = 0$ and $|\tilde{T}_0| \le S$

where

$$k_1 := \max(1, 2d_0 - 2) \tag{15}$$

then,

1) at all $t > 0$, $|\tilde{T}_t| \le S$, $|\tilde{\Delta}_{e,t}| = 0$, $\tilde{\Delta}_t \subseteq \mathcal{S}_t(d_0)$ and so $|\tilde{\Delta}_t| \le 2(d_0 - 1)S_a$,

2) at all $t > 0$, $|T_t| \le S$, $|\Delta_{e,t}| \le S_a$, and $|\Delta_t| \le k_1 S_a$,

3) at all $t > 0$, $\|x_t - \hat{x}_{t,modcs}\| \le 7.50\epsilon$.

Proof: The proof is given in Appendix B. It follows using induction.
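
To make the conditions of Theorem 1 concrete, the sketch below evaluates the threshold, the rate bound and the RIC order for given parameters. It assumes the readings $\alpha = (\zeta_M/\sqrt{S_a})\,7.50\epsilon$, $G = (\alpha + (\zeta_M/\sqrt{S_a})\,7.50\epsilon)/d_0$ and RIC order $S + (2k_1 + 1)S_a$; the function name and the example numbers are hypothetical.

```python
import math
from typing import Dict


def theorem1_quantities(zeta_M: float, S_a: int, eps: float, d0: int, S: int) -> Dict[str, float]:
    """Evaluate the quantities appearing in the Theorem 1 conditions (assumed forms)."""
    k1 = max(1, 2 * d0 - 2)
    err_inf = zeta_M / math.sqrt(S_a) * 7.50 * eps  # bound on the largest error entry
    alpha = err_inf                                  # support estimation threshold
    G = (alpha + err_inf) / d0                       # lower bound on the increase rate r
    return {
        "k1": k1,
        "alpha": alpha,
        "G": G,
        "modcs_ric_order": S + (2 * k1 + 1) * S_a,   # order of the RIC that must be <= 0.207
        "simple_cs_ric_order": 2 * S,                # order needed by simple CS (Lemma 2)
    }
```

For `d0 = 2` the modified-CS condition involves the RIC of order $S + 5S_a$, which is smaller than the order $2S$ needed by simple CS whenever $S_a < S/5$.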

Remark 2: Condition 4 is not restrictive. It is easy to see that this will hold if $n_0$ is large enough to ensure that
$\delta_{2S}(A_0) \le 0.207$.

C. Stability result for Modified-CS with Add-LS-Del

The first step to show stability is to find sufficient conditions for (a) a certain set of large coefficients to definitely get
detected, and (b) to definitely not get falsely deleted, and (c) for the zero coefficients in $T_{add}$ to definitely get deleted. These
can be obtained using Lemma 1 and simple facts similar to Proposition 1.

As explained before, we can assume that the modified-CS reconstruction error is not one-sparse but is more spread out. The
same assumption should also be valid for the LS step error. We state these next.

Assumption 2: Consider Algorithm 2. Assume that the Modified-CS reconstruction error is spread out enough so that

$$\|x_t - \hat{x}_{t,modcs}\|_\infty \le \frac{\zeta_M}{\sqrt{S_a}} \|x_t - \hat{x}_{t,modcs}\|$$

at all times, $t$, for some $\zeta_M \le \sqrt{S_a}$. Similarly, assume that the LS step error along $T_{add,t}$ is spread out enough so that

$$\|(x_t - \hat{x}_{t,add})_{T_{add,t}}\|_\infty \le \frac{\zeta_L}{\sqrt{S_a}} \|(x_t - \hat{x}_{t,add})_{T_{add,t}}\|$$

at all times, $t$, for some $\zeta_L \le \sqrt{S_a}$.

Combining the above assumption with Lemmas 1 and 3, we get the following lemmas.

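Lemma 5 (Detection condition): Consider Algorithm 2. Assume Assumption 2. Assume that $|\mathcal{N}_t| = S_{\mathcal{N}_t}$, $|\Delta_{e,t}| \le S_{\Delta_{e,t}}$,
$|\Delta_t| \le S_{\Delta_t}$. Pick a $b_1 > 0$. All elements of the set $\{i \in \Delta_t : |(x_t)_i| \ge b_1\}$ will get detected in step 3 if

• $\delta_{S_{\mathcal{N}_t} + S_{\Delta_{e,t}} + 2S_{\Delta_t}} \le 0.207$, and $b_1 > \alpha_{add} + \frac{\zeta_M}{\sqrt{S_a}} 7.50\epsilon$.

Lemma 6 (Deletion and No false-deletion condition): Consider Algorithm 2. Assume Assumption 2. Assume that $|T_{add,t}| \le S_{T_{add,t}}$
and $|\Delta_{add,t}| \le S_{\Delta_{add,t}}$.

1) Pick a $b_1 > 0$. No element of the set $\{i \in T_{add,t} : |(x_t)_i| \ge b_1\}$ will get (falsely) deleted in step 4 if

• $\delta_{S_{T_{add,t}}} \le 1/2$ and $b_1 > \alpha_{del} + \frac{\zeta_L}{\sqrt{S_a}}\left(\sqrt{2}\epsilon + 2\theta_{S_{T_{add,t}}, S_{\Delta_{add,t}}} \|(x_t)_{\Delta_{add,t}}\|\right)$.

2) All elements of $\Delta_{e,add}$ will get deleted in step 4 if

• $\delta_{S_{T_{add,t}}} \le 1/2$ and $\alpha_{del} \ge \frac{\zeta_L}{\sqrt{S_a}}\left(\sqrt{2}\epsilon + 2\theta_{S_{T_{add,t}}, S_{\Delta_{add,t}}} \|(x_t)_{\Delta_{add,t}}\|\right)$.

Using the above lemmas, we can obtain sufficient conditions to ensure that, for some $d_0 \le d$, at each time $t$, $\tilde{\Delta}_t \subseteq \mathcal{S}_t(d_0)$
(so that $|\tilde{\Delta}_t| \le (2d_0 - 2)S_a$) and $|\tilde{\Delta}_{e,t}| = 0$.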

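Theorem 2 (Stability of modified-CS with add-LS-del): Consider Algorithm 2. Assume Signal Model 1 on $x_t$. Also assume
that $y_t$ satisfies (1). Assume that Assumption 2 holds. If, for some $1 \le d_0 \le d$, the following hold

1) (addition and deletion thresholds)

a) $\alpha_{add}$ is large enough so that there are at most $f$ false additions per unit time,

b) $\alpha_{del} = \sqrt{\frac{2}{S_a}} \zeta_L \epsilon + 2 k_3 \theta_{S+S_a+f, k_2 S_a} \zeta_L r$,

2) (number of measurements)

a) $\delta_{S + S_a(1 + 2k_1)} \le 0.207$,

b) $\delta_{S + S_a + f} \le 1/2$,

c) $\theta_{S+S_a+f, k_2 S_a} \le \frac{1}{2} \frac{d_0}{4 k_3 \zeta_L}$,

3) (new element increase rate) $r > \max(G_1, G_2)$, where

$$G_1 := \frac{\alpha_{add} + \frac{\zeta_M}{\sqrt{S_a}} 7.50\epsilon}{d_0}, \quad G_2 := \frac{2\sqrt{2} \zeta_L \epsilon}{\sqrt{S_a}\,(d_0 - 4 k_3 \theta_{S+S_a+f, k_2 S_a} \zeta_L)} \tag{16}$$

4) (initial time) $n_0$ is large enough to ensure that $\tilde{\Delta}_0 \subseteq \mathcal{S}_0(d_0)$, $|\tilde{\Delta}_0| \le (2d_0 - 2)S_a$, $|\tilde{\Delta}_{e,0}| = 0$, $|\tilde{T}_0| \le S$,

where

$$k_1 := \max(1, 2d_0 - 2), \quad k_2 := \max(0, 2d_0 - 3), \quad k_3 := \sqrt{\sum_{j=1}^{d_0 - 1} j^2 + \sum_{j=1}^{d_0 - 2} j^2} \tag{17}$$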

then, at all $t \ge 0$,
1) $|\tilde{T}_t| \le S$, $|\tilde{\Delta}_{e,t}| = 0$, and $\tilde{\Delta}_t \subseteq \mathcal{S}_t(d_0)$ and so $|\tilde{\Delta}_t| \le (2d_0 - 2)S_a$,

2) $|T_t| \le S$, $|\Delta_{e,t}| \le S_a$, and $|\Delta_t| \le k_1S_a$,

3) $|T_{add,t}| \le S + S_a + f$, $|\Delta_{e,add,t}| \le S_a + f$, and $|\Delta_{add,t}| \le k_2S_a$,

4) $\|x_t - \hat{x}_{t,modcs}\| \le C_1(S + S_a + 2k_1S_a)\epsilon \le 7.50\epsilon$,

5) $\|x_t - \hat{x}_t\| \le 1.261k_3\sqrt{S_a}r + 1.12\epsilon$.
Proof: The proof is given in Appendix C.
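The constants appearing in Theorem 2 are easy to tabulate for a given $d_0$. The following is a minimal sketch, assuming that (17) reads $k_1 = \max(1, 2d_0 - 2)$, $k_2 = \max(0, 2d_0 - 3)$, $k_3 = \sqrt{\sum_{i=1}^{d_0-1} i^2 + \sum_{i=1}^{d_0-2} i^2}$ and that $G_1 = (\alpha_{add} + \frac{\zeta_M}{\sqrt{S_a}}7.50\epsilon)/d_0$, as used in the discussion that follows. The function names are hypothetical and introduced only for illustration.

```python
import math


def theorem2_constants(d0):
    """Return (k1, k2, k3) for a given d0, following the assumed form of (17)."""
    k1 = max(1, 2 * d0 - 2)
    k2 = max(0, 2 * d0 - 3)
    # Sum of squares 1..d0-1 plus sum of squares 1..d0-2 (empty sums are 0).
    s1 = sum(i * i for i in range(1, d0))
    s2 = sum(i * i for i in range(1, d0 - 1))
    k3 = math.sqrt(s1 + s2)
    return k1, k2, k3


def ric_order_condition_2a(S, S_a, d0):
    """Order of the RIC in condition 2a: S + S_a (1 + 2 k1)."""
    k1, _, _ = theorem2_constants(d0)
    return S + S_a * (1 + 2 * k1)


def g1(alpha_add, zeta_M, S_a, eps, d0):
    """Assumed form of G_1: (alpha_add + zeta_M / sqrt(S_a) * 7.50 * eps) / d0."""
    return (alpha_add + zeta_M / math.sqrt(S_a) * 7.50 * eps) / d0
```

For $d_0 = 2$ this gives $k_1 = 2$, $k_2 = 1$, $k_3 = 1$, so condition 2a involves the RIC of order $S + 5S_a$.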

D. Discussion 

Notice that, with Signal Model 1, at all times $t$, the signals have the same support set size, $|\mathcal{N}_t| = S$, and the same signal
power, $\|x_t\|^2 = (S - (2d - 2)S_a)M^2 + 2S_a\sum_{j=1}^{d-1} j^2r^2$.

The support error bound in both results above is proportional to $S_a$. Thus, under slow support change, i.e. $S_a \ll S$, this
bound is small compared to the support size $S$. Also, the reconstruction error is bounded by a constant times $\epsilon$. If the signal
power is large enough compared to the noise (high enough signal-to-noise ratio), this bound is also small compared to the
signal power.

To make the comparisons simpler, let us fix $d_0 = 2$ and let $f = S_a$ in Theorem 2. Consider the conditions on the number
of measurements. Modified-CS needs $\delta_{S+5S_a} \le 0.207$. Modified-CS-add-LS-del needs $\delta_{S+5S_a} \le 0.207$; $\delta_{S+2S_a} \le 0.5$ (this is
implied by the first condition) and $\theta_{S+2S_a,S_a} \le \frac{1}{4\zeta_L}$. Since $\theta_{S+2S_a,S_a} \le \delta_{S+3S_a}$, the third condition is also implied by the first
as long as $\zeta_L \le 1.2$. In simulation tests (described in Sec. IV-D) we observed that this was usually true. Then, both modified-CS
and modified-CS-add-LS-del need the same condition on the number of measurements: $\delta_{S+5S_a} \le 0.207$. Consider simple CS.
Since simple CS is not a recursive approach (each time instant is handled separately), Lemma 2 is also a stability result for
it. From Lemma 2, simple CS needs $\delta_{2S} \le 0.207$ to get the same error bound. Under the slow support change assumption,
$S_a \ll S$. In this case, clearly simple CS requires a stronger condition than either of the modified-CS algorithms.
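The claim that the third condition follows from the first when $\zeta_L \le 1.2$ can be spelled out in one line. This is a sketch, assuming the third condition for $d_0 = 2$ reads $\theta_{S+2S_a,S_a} \le 1/(4\zeta_L)$ and using only the facts that the ROC is bounded by the RIC of the summed order and that the RIC is non-decreasing in its order:

```latex
\theta_{S+2S_a,\,S_a} \;\le\; \delta_{S+3S_a} \;\le\; \delta_{S+5S_a} \;\le\; 0.207,
\qquad
0.207 \le \frac{1}{4\zeta_L}
\iff
\zeta_L \le \frac{1}{4 \times 0.207} \approx 1.208 .
```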

Let us compare the requirement on $r$. In Theorem 2 for modified-CS-add-LS-del, since $\theta_{S+S_a+f,k_2S_a} \le \frac{1}{2}\frac{d_0}{4k_3\zeta_L}$, we have $G_2 \le \frac{4\sqrt{2}\zeta_L\epsilon}{\sqrt{S_a}d_0} < G_1$, and thus $G_1$ is what decides the minimum allowed value of $r$. Thus, it needs $r \ge G_1 = \frac{1}{d_0}[\alpha_{add} + \frac{\zeta_M}{\sqrt{S_a}}7.50\epsilon]$. On the other hand, modified-CS needs $r \ge G = \frac{1}{d_0}[2\frac{\zeta_M}{\sqrt{S_a}}7.50\epsilon]$. If $\alpha_{add}$ is close to zero, this means
that the minimum magnitude increase rate, $r$, required by Theorem 2 is almost half of that required by Theorem 1. In our
simulation experiments, $\alpha_{add}$ was typically quite small: it was usually close to a small constant times $\epsilon/\sqrt{n}$ (see Sec. VI-B).

Remark 3: In the discussion above, we fixed $d_0$. Now let us see the reason for allowing $d_0$ to be anything below $d$. If the
rate of magnitude increase, $r$, is smaller, $r \ge G_1$ or $r \ge G$ will hold for a larger value of $d_0$. This means that the support error
bound, $(2d_0 - 2)S_a$, will be larger. This, in turn, decides what conditions on the RIC and ROC are needed (in other words,
how many measurements, $n_t$, are needed). Smaller $r$ means a larger $d_0$ is needed which, in turn, means that stronger conditions
on the RIC and ROC (larger $n_t$) are needed. Thus, for a given $n_t = n$, as $r$ is reduced, the algorithm will stabilize to larger
and larger support error levels (larger $d_0$) and finally become unstable (because the given $n$ does not satisfy the conditions on
$\delta$, $\theta$ for the larger $d_0$).

IV. Stability Results: Realistic Signal Model 

We introduce the signal change model in the next subsection and then give the results in the two subsections that follow it.
Discussion is given in the two subsections after those.

A. Realistic Signal Change Model 

Briefly, our model assumes the following. At any time $t$, the signal vector $x_t$ is a sparse vector with support set $\mathcal{N}_t$ of size
$S$ or less. At most $S_a$ elements get added to the support at each time $t$ and at most $S_a$ elements get removed from it. At
time $t = t_j$, a new element $j$ gets added at an initial magnitude $a_j$, and its magnitude increases for the next $d_j \ge d_{min}$ time
units. Its magnitude increase at time $\tau$ (for any $t_j < \tau \le t_j + d_j$) is $r_{j,\tau}$. Also, at each time $t$, at most $S_a$ elements out of the
"large elements" set (defined in the signal model) leave the set and begin to decrease. These elements keep decreasing and
get removed from the support in at most $b$ time units. In the model as stated above, we are implicitly allowing an element $j$
to get added to the support at most once. In general, $j$ can get added, then removed and then added again. To allow for this,
we let $t_j$ be the set of time instants at which $j$ gets added; we replace $a_j$ by $a_{j,t}$ and we replace $d_j$ by $d_{j,t}$ (both of which
are nonzero only for $t \in t_j$).

As demonstrated in Section V, the above assumptions are practically valid for MRI sequences. We specify the model precisely
below.
Signal Model 2: Assume the following.

1) At the initial time, $t = 0$, the support set, $\mathcal{N}_0$, contains $S_0$ nonzero elements, i.e. $|\mathcal{N}_0| = S_0$.

2) At time $t$, $S_{a,t}$ elements are added to the support set. Denote this set by $\mathcal{A}_t$. At time $t$, a new element $j$ gets added to
the support at an initial magnitude $a_{j,t}$ and its magnitude increases for at least the next $d_{min} > 0$ time instants. At time
$\tau$ (for $t < \tau \le t + d_{min}$), the magnitude of element $j$ increases by $r_{j,\tau} \ge 0$.

• $a_{j,t}$ is nonzero only if element $j$ got added at time $t$; for all other times, we set it to zero.

3) We define the "large set" as

$\mathcal{L}_t := \{j \notin \cup_{\tau=t-d_{min}+1}^{t} \mathcal{A}_\tau : |(x_t)_j| \ge \ell\}$,

for a given constant $\ell$. Elements in $\mathcal{L}_{t-1}$ either remain in $\mathcal{L}_t$ (while increasing or decreasing or remaining constant) or
decrease enough to leave $\mathcal{L}_t$.

4) At time $t$, $S_{d,t}$ elements out of $\mathcal{L}_{t-1}$ decrease enough to leave $\mathcal{L}_{t-1}$. Denote this set $\mathcal{B}_t$. All these elements continue to
keep decreasing and become zero (removed from support) within at most $b$ time units. Also, at time $t$, $S_{r,t}$ elements out
of these decreasing elements are removed from the support. Denote this set by $\mathcal{R}_t$.

5) At all times $t$, $0 \le S_{a,t} \le S_a$, $0 \le S_{d,t} \le \min\{S_a, |\mathcal{L}_{t-1}|\}$, $0 \le S_{r,t} \le S_a$ and the support size, $S_t := |\mathcal{N}_t| \le S$ for
constants $S$ and $S_a$ such that $S + S_a \le m$.

The above is not a generative model. It is only a set of assumptions on signal change. One possible generative model that
satisfies these assumptions is given in Appendix G.

Remark 4: It is easy to see that Signal Model 1 is a special case of Signal Model 2 with $a_{j,t} = r_{j,t} = r$, $d_{min} = d$, $b = d$,
$S_0 = S$, $S_{a,t} = S_{d,t} = S_{r,t} = S_a$, $\ell = dr$.

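From the above model, the newly added elements' set is $\mathcal{A}_t := \mathcal{N}_t \setminus \mathcal{N}_{t-1}$; the newly removed elements' set is $\mathcal{R}_t := \mathcal{N}_{t-1} \setminus \mathcal{N}_t$; and
the set of elements that begin to decrease at $t$ is $\mathcal{B}_t := \mathcal{L}_{t-1} \setminus \mathcal{L}_t$. Define the following sets: the set of increasing (actually
non-decreasing) elements at $t$,

$\mathcal{I}_t := \{j \in \mathcal{N}_t : |(x_t)_j| \ge |(x_{t-1})_j|\}$;
and the set of small and decreasing elements,

$\mathcal{SD}_t := \mathcal{L}_t^c \cap \{j \in \mathcal{N}_t : 0 < |(x_t)_j| < |(x_{t-1})_j|\}$.

Notice that $\mathcal{I}_t$ also includes $j$ if its magnitude does not change from $t-1$ to $t$.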

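Condition 2 of the above model implies that (i) $|\mathcal{A}_t| = S_{a,t}$; (ii) if $j \in \mathcal{A}_{t-t_0}$ (i.e. if $j$ is added at $t - t_0$) for a $t_0 < d_{min}$,
then $|(x_t)_j| = a_{j,t-t_0} + \sum_{\tau=t-t_0+1}^{t} r_{j,\tau}$; and (iii) $\mathcal{A}_t \subseteq \mathcal{I}_t \cap \mathcal{I}_{t+1} \cap \dots \cap \mathcal{I}_{t+d_{min}}$ (all newly added elements increase for at least
$d_{min}$ time instants).

Condition 3 implies that $\mathcal{L}_{t-1} \subseteq \mathcal{L}_t \cup \mathcal{SD}_t$. It also implies that $(\cup_{\tau=t-d_{min}+1}^{t} \mathcal{A}_\tau) \cap \mathcal{L}_t = \emptyset$. This, along with condition
2, means that $\cup_{\tau=t-d_{min}}^{t} \mathcal{A}_\tau \subseteq \mathcal{I}_t$.

Condition 4 implies that $|\mathcal{B}_t| = S_{d,t}$; $\mathcal{L}_{t-1} \setminus \mathcal{B}_t \subseteq \mathcal{L}_t$; $\mathcal{SD}_t = \mathcal{SD}_{t-1} \cup \mathcal{B}_t \setminus \mathcal{R}_t$; $\sum_{\tau=1}^{t} S_{r,\tau} \ge \sum_{\tau=1}^{t-b} S_{d,\tau}$; $|\mathcal{SD}_t| \le
\sum_{\tau=t-b+1}^{t} S_{d,\tau}$; and $|\mathcal{R}_t| = S_{r,t}$.

Condition 5, along with the above, implies that $|\mathcal{SD}_t| \le bS_a$.

Finally, it is easy to see that $\mathcal{N}_t = \mathcal{I}_t \cup \mathcal{L}_t \cup \mathcal{SD}_t$. The sets $\mathcal{I}_t$, $\mathcal{L}_t$ are not disjoint, but both are disjoint from $\mathcal{SD}_t$.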

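The above model tells us the following. Consider an element $j$ that got added at time $t$, i.e. $j \in \mathcal{A}_t$. At $\tau = t, t+1, \dots, t +
d_{min} - 1$, $j \in \mathcal{I}_\tau$ and $j \notin \mathcal{L}_\tau$. At $\tau = t + d_{min}$, $j \in \mathcal{I}_\tau$; if $|(x_\tau)_j| \ge \ell$ then $j \in \mathcal{L}_\tau$ as well. For $\tau > t + d_{min}$, what
happens depends on $\tau - 1$. If $j \in \mathcal{L}_{\tau-1}$, then either $j \in \mathcal{L}_\tau$ or it decreases enough to enter the small and decreasing set, i.e.
$j \in \mathcal{B}_\tau \subseteq \mathcal{SD}_\tau$. If $j \in \mathcal{SD}_{\tau-1}$, then either it keeps decreasing or gets removed, i.e. either $j \in \mathcal{SD}_\tau$ or $j \in \mathcal{R}_\tau \subseteq \mathcal{N}_\tau^c$. If
$j \in \mathcal{L}_{\tau-1}^c \cap \mathcal{I}_{\tau-1}$, then, if $|(x_\tau)_j| \ge \ell$ then $j \in \mathcal{L}_\tau \cap \mathcal{I}_\tau$, else $j \in \mathcal{L}_\tau^c \cap \mathcal{I}_\tau$.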
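The life cycle described above can be traced for a single element. The following is a minimal sketch, assuming the set definitions of Signal Model 2 (the increasing set, the large set and the small-and-decreasing set) and an element that is added only once; the labels "I", "L", "SD" and the function names are hypothetical and used only for illustration.

```python
def classify(prev_mag, cur_mag, ell, recently_added):
    """Labels of the sets an element belongs to at the current time.

    prev_mag, cur_mag: magnitudes at the previous and the current time.
    ell: threshold of the "large set".
    recently_added: True if the element was added within the last d_min
        time instants (such elements are excluded from the large set).
    """
    labels = set()
    if cur_mag == 0:
        return labels  # not in the support
    if cur_mag >= prev_mag:
        labels.add("I")  # increasing (actually non-decreasing)
    if not recently_added and cur_mag >= ell:
        labels.add("L")  # large set
    if "L" not in labels and 0 < cur_mag < prev_mag:
        labels.add("SD")  # small and decreasing
    return labels


def trace(magnitudes, ell, d_min):
    """Classify a magnitude trajectory; magnitudes[0] is the value at the add time."""
    out = []
    prev = 0.0
    for k, cur in enumerate(magnitudes):
        out.append(classify(prev, cur, ell, recently_added=(k < d_min)))
        prev = cur
    return out


# Element added at magnitude 1, growing by 1 per step for d_min = 3 steps,
# staying constant once, then decaying to zero.
history = trace([1, 2, 3, 4, 4, 2, 1, 0], ell=4, d_min=3)
```

In this example the element is in the increasing set only for the first three instants, joins the large set at the fourth, moves to the small-and-decreasing set once its magnitude drops below $\ell$, and finally leaves the support.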

We now discuss sufficient conditions for condition 5 of the signal model to hold. 

Remark 5: Since $S_t = S_{t-1} + S_{a,t} - S_{r,t} = S_0 + \sum_{\tau=1}^{t} S_{a,\tau} - \sum_{\tau=1}^{t} S_{r,\tau}$, $S_t \le S$ holds if $S_0 \le S$ and $\sum_{\tau=1}^{t} S_{a,\tau} \le \sum_{\tau=1}^{t} S_{r,\tau}$.

Notice that an element $j$ could get added, then removed and added again later. Let

$t_j := \{t : a_{j,t} \neq 0\}$

denote the set of time instants at which $j$ gets added. Clearly, $t_j = \emptyset$ if $j$ never got added. Let

$a_{min} := \min_{j : t_j \neq \emptyset} \min_{t \in t_j, t \neq 0} a_{j,t}$

denote the minimum of $a_{j,t}$ over all elements $j$ that got added at $t > 0$. We are excluding coefficients that never got added
and those that got added at $t = 0$. Let

$r_{min}(d) := \min_{j : t_j \neq \emptyset} \min_{t \in t_j, t \neq 0} \min_{\tau \in [t+1, t+d]} r_{j,\tau}$

denote the minimum, over all elements $j$ that got added at $t > 0$, of the minimum of $r_{j,\tau}$ over the first $d$ time instants after $j$
got added.


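Define

$\ell := a_{min} + d_{min}r_{min}(d_{min})$. (18)

With $\ell$ defined this way, clearly, $\mathcal{N}_t = (\cup_{\tau=t-d_{min}+1}^{t} \mathcal{A}_\tau) \cup \mathcal{L}_t \cup \mathcal{SD}_t$ where the three sets are mutually disjoint.

Also, with $\ell$ as above, it is clear that for $t > d_{min}$, $\mathcal{L}_t = \mathcal{L}_{t-1} \cup \mathcal{A}_{t-d_{min}-1} \setminus \mathcal{B}_t$, and for $t \le d_{min}$, $\mathcal{L}_t = \mathcal{L}_{t-1} \setminus \mathcal{B}_t$. Here,
by definition, $\mathcal{L}_{t-1}$ and $\mathcal{A}_{t-d_{min}-1}$ are disjoint and $\mathcal{B}_t \subseteq \mathcal{L}_{t-1}$. Thus,

$|\mathcal{L}_t| = |\mathcal{L}_0| + \sum_{\tau=1}^{t-d_{min}-1} S_{a,\tau} - \sum_{\tau=1}^{t} S_{d,\tau}$.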
t=1 r=l 

Also notice that $|\mathcal{L}_0| \le S_0$. Using these facts and Remark 5, we can conclude the following.
Remark 6: Let $\ell := a_{min} + d_{min}r_{min}(d_{min})$. Then, condition 5 of Signal Model 2 holds if

1) $0 \le S_{a,t} \le S_a$ and $0 \le S_{d,t} \le S_a$,

2) $(d_{min} + b + 1)S_a \le |\mathcal{L}_0| \le S_0 \le S$, and

3) $\sum_{\tau=1}^{t} S_{a,\tau} \le \sum_{\tau=1}^{t-b} S_{d,\tau} \le |\mathcal{L}_0| + \sum_{\tau=1}^{t-b-d_{min}-1} S_{a,\tau}$.

The leftmost lower bound of the second condition ensures that the upper bound of the third condition is not smaller than
the lower bound. The upper bound of the third condition ensures that $S_{d,t} \le |\mathcal{L}_{t-1}|$ always (it is actually written to ensure
$S_{d,t-b} \le |\mathcal{L}_{t-b-1}|$). $S_0 \le S$ and the lower bound of the third condition ensure that $S_t \le S$ (as explained in Remark 5).
A simpler sufficient condition is as follows.

Remark 7: Let $\ell := a_{min} + d_{min}r_{min}(d_{min})$. Then, condition 5 of Signal Model 2 holds if $(d_{min} + b + 1)S_a \le |\mathcal{L}_0| \le S_0 \le S$;
$S_{d,t} = S_a$ for all $t$; and for $1 \le t \le b$, $S_{a,t} = 0$, and for $t > b$, $S_{a,t} = S_a$.
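The sufficient conditions of Remark 6 can be checked numerically for a given schedule of additions and decreases. The following is a minimal sketch, assuming the third condition reads $\sum_{\tau=1}^{t} S_{a,\tau} \le \sum_{\tau=1}^{t-b} S_{d,\tau} \le |\mathcal{L}_0| + \sum_{\tau=1}^{t-b-d_{min}-1} S_{a,\tau}$; the function names and the 1-indexed list layout (a dummy entry at index 0) are choices made only for this illustration.

```python
def prefix_sum(seq, upto):
    """Sum of seq[1..upto] for a 1-indexed sequence (index 0 is a dummy entry)."""
    return sum(seq[1:upto + 1]) if upto >= 1 else 0


def remark6_holds(Sa_seq, Sd_seq, L0, S0, S, S_a, b, d_min):
    """Check conditions 1)-3) of Remark 6 for t = 1, ..., len(Sa_seq) - 1."""
    # Condition 2: (d_min + b + 1) S_a <= |L_0| <= S_0 <= S.
    if not ((d_min + b + 1) * S_a <= L0 <= S0 <= S):
        return False
    for t in range(1, len(Sa_seq)):
        # Condition 1: per-time bounds on additions and decreases.
        if not (0 <= Sa_seq[t] <= S_a and 0 <= Sd_seq[t] <= S_a):
            return False
        # Condition 3: cumulative additions vs. cumulative (delayed) decreases.
        adds = prefix_sum(Sa_seq, t)
        decs = prefix_sum(Sd_seq, t - b)
        upper = L0 + prefix_sum(Sa_seq, t - b - d_min - 1)
        if not (adds <= decs <= upper):
            return False
    return True


# The schedule of Remark 7: S_{d,t} = S_a for all t, no additions for t <= b.
T, S_a, b, d_min = 30, 2, 3, 3
Sd = [0] + [S_a] * T
Sa = [0] + [0 if t <= b else S_a for t in range(1, T + 1)]
ok = remark6_holds(Sa, Sd, L0=14, S0=20, S=20, S_a=S_a, b=b, d_min=d_min)
```

With $|\mathcal{L}_0| = (d_{min} + b + 1)S_a = 14$ the check passes; it fails if $|\mathcal{L}_0|$ is reduced below that value or if additions start before any decrease has had $b$ time units to complete.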

In the above model, we only assume that all coefficients will get removed in at most $b$ time units. However, it can happen
that some coefficients get removed earlier than that and hence it is fair to include this in the signal model. We do this below.

Signal Model 3: Assume Signal Model 2 with the following extra assumption.

• Out of the $S_{d,t}$ elements that started decreasing at time $t$, at least $\frac{\tau}{b}S_{d,t}$ of them get removed by $t + \tau$ for $\tau \le b$.
All implications of the above model are the same as those of Signal Model 2 except that now, $|\mathcal{SD}_t| \le S_{d,t} + \frac{b-1}{b}S_{d,t-1} + \dots + \frac{1}{b}S_{d,t-b+1} \le \frac{b+1}{2}S_a$, while for Signal Model 2, $|\mathcal{SD}_t| \le bS_a$.
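The constant $\frac{b+1}{2}$ in the Signal Model 3 bound can be obtained as follows. This is a sketch, assuming that the extra assumption means that at most a fraction $1 - i/b$ of the elements that started decreasing $i$ time units ago are still in the support:

```latex
|\mathcal{SD}_t|
\;\le\; \sum_{i=0}^{b-1}\Bigl(1-\frac{i}{b}\Bigr) S_{d,t-i}
\;\le\; S_a \sum_{i=0}^{b-1}\frac{b-i}{b}
\;=\; \frac{S_a}{b}\cdot\frac{b(b+1)}{2}
\;=\; \frac{b+1}{2}\,S_a .
```

For example, $b = 3$ gives $|\mathcal{SD}_t| \le 2S_a$, compared with $3S_a$ under Signal Model 2.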



B. Modified-CS Stability Result 

For the above signal model, we can claim the following. 

Theorem 3: Consider Algorithm 1. Assume Signal Model 3 on $x_t$. Also assume that $y_t$ satisfies (1). Assume that Assumption
1 holds. If there exists a $d_0 \le d_{min}$ such that the following hold:
1) algorithm parameters

a) $\alpha = \frac{\zeta_M}{\sqrt{S_a}}7.50\epsilon$,

2) number of measurements

a) $\delta_{S+3(\frac{b+1}{2}S_a+d_0S_a+S_a)} \le 0.207$,

3) initial magnitude and magnitude increase rate:

$\min\{\ell, \min_{j : t_j \neq \emptyset} \min_{t \in t_j}(a_{j,t} + \sum_{\tau=t+1}^{t+d_0} r_{j,\tau})\} > \alpha + \frac{\zeta_M}{\sqrt{S_a}}7.50\epsilon$,

4) at $t = 0$, $n_0$ is large enough to ensure that $|\tilde{\Delta}_t| \le \frac{b+1}{2}S_a + d_0S_a$, $|\tilde{\Delta}_{e,t}| = 0$,
then, for all $t$,

1) $|\tilde{\Delta}_t| \le \frac{b+1}{2}S_a + d_0S_a$, $|\tilde{\Delta}_{e,t}| = 0$, $|\tilde{T}_t| \le S$,

2) $|\Delta_t| \le \frac{b+1}{2}S_a + d_0S_a + S_a$, $|T_t| \le S$, $|\Delta_{e,t}| \le S_a$,

3) and $\|x_t - \hat{x}_t\| \le 7.50\epsilon$.
Proof: See Appendix D.

Corollary 1: Under Signal Model 2, the result of Theorem 3 changes in the following way: replace $\frac{b+1}{2}S_a$ by $bS_a$
everywhere in the result.

Remark 8: Condition 4 of the above result is not restrictive. It is easy to see that it will hold if $\delta_{2S_0}(A_0) \le 0.207$ and if

$|\mathcal{L}_0| \ge S_0 - (\frac{b+1}{2}S_a + d_0S_a)$.

Remark 9: A simpler sufficient condition for condition 3 is: $\min(\ell, a_{min} + d_0r_{min}(d_0)) > \alpha + \frac{\zeta_M}{\sqrt{S_a}}7.50\epsilon$.

C. Modified-CS-Add-LS-Del Stability Result 
Finally we study Modified-CS-Add-LS-Del. 

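Theorem 4: Consider Algorithm 2. Assume Signal Model 3 on $x_t$. Also assume that $y_t$ satisfies (1). Assume that Assumption
2 holds. If there exists a $d_0 \le d_{min}$ such that the following hold:

1) algorithm parameters

a) $\alpha_{add}$ is large enough so that there are at most $f$ false adds at time $t$, i.e. $|\hat{\mathcal{A}}_t \setminus \mathcal{N}_t| \le f$,

b) $\alpha_{del} = 1.12\frac{\zeta_L}{\sqrt{S_a}}\epsilon + 0.261\zeta_Lh$, where $h^2 = (\frac{b+1}{2} + d_0)(\alpha_{add} + \frac{\zeta_M}{\sqrt{S_a}}7.50\epsilon)^2$,

2) number of measurements

a) $\delta_{S+3(\frac{b+1}{2}S_a+d_0S_a+S_a)} \le 0.207$,

b) $\delta_{S+S_a+f} \le 0.207$,

c) $\theta_{S+S_a+f,\frac{b+1}{2}S_a+d_0S_a} \le 0.207$,
3) initial magnitude and magnitude increase rate:

$\min\{\ell, \min_{j : t_j \neq \emptyset} \min_{t \in t_j}(a_{j,t} + \sum_{\tau=t+1}^{t+d_0} r_{j,\tau})\} > \max\{\alpha_{add} + \frac{\zeta_M}{\sqrt{S_a}}7.50\epsilon, 2\alpha_{del}\}$ (19)

4) at $t = 0$, $n_0$ is large enough to ensure that $|\tilde{\Delta}_t| \le \frac{b+1}{2}S_a + d_0S_a$, $|\tilde{\Delta}_{e,t}| = 0$,
then

1) $\tilde{\Delta}_t \subseteq \mathcal{SD}_t \cup \mathcal{A}_t \cup \mathcal{A}_{t-1} \cup \dots \cup \mathcal{A}_{t-d_0+1}$,

2) $|\tilde{\Delta}_t| \le \frac{b+1}{2}S_a + d_0S_a$, $|\tilde{\Delta}_{e,t}| = 0$, $|\tilde{T}_t| \le S$,

3) $|\Delta_t| \le \frac{b+1}{2}S_a + d_0S_a + S_a$, $|T_t| \le S$,

4) $\|x_t - \hat{x}_{t,modcs}\| \le 7.50\epsilon$,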

5) ||x t -i t || < 1.12e+ 1.261^(^+11 + rf )(a del + 7.50e)5 a . 
Proof: See Appendix [E] 

Remark 10: Claims similar to Corollary 1 and Remarks 8 and 9 hold for the above result also.



D. Discussion 

Remark 11: Consider the noise-free case, i.e. the case when $\epsilon = 0$. In this case, our results say the following: if RIP of
order $S + kS_a$ holds with $\delta_{S+kS_a} \le 0.207$, and if the support thresholds are set to zero, then both algorithms will exactly
recover all sparse signal sequences with support size at most $S$, and number of support additions and removals per unit time
at most $S_a$. It is easy to show that RIP (or actually left-RIP) of order $S + S_a$ at all times $t > 0$ is also necessary for the above.
We give a proof for this in Appendix F. Thus the sufficient condition that our results need is only slightly stronger and clearly
cannot be improved much further since RIP($S + S_a$) is necessary. Thus, for example, RIP of order $S + k'\sqrt{S_a}$ or $\sqrt{S} + k'S_a$
will not work. This remark is inspired by a concern of an anonymous reviewer.

Remark 12: Notice that Signal Models 2 or 3 allow for both slow and fast signal magnitude increase or decrease. Slow
magnitude increase/decrease would happen, for example, in an imaging problem when one object slowly morphs into another
with gradual intensity changes. Or, in case of brain regions becoming "active" in response to stimuli, the activity level gradually
increases from zero to a certain maximum value within a few milliseconds (10-12 frames of fMRI data), and similarly the
"activity" level decays to zero within a few milliseconds. In both of the above examples, a new coefficient will get added to
the support at time $t$ at a small magnitude $a_{j,t}$ and increase by $r_{j,\tau}$ per unit time for some time after that. Similarly for the
decay to zero of the brain's activity level. On the other hand, the signal model also allows support changes resulting from
motion of objects, e.g. translation. In this case, the signal magnitude changes will typically not be slow. As the object moves,
a set of new pixels enter the support and another set leave. The entering pixels may have large enough pixel intensity and their
intensity may never change. For our model this means that the pixel enters the support at a large enough initial magnitude $a_{j,t}$
but its magnitude never changes, i.e. $r_{j,\tau} = 0$ for all $\tau$. If all pixels exit the support without their magnitude first decreasing,
then $b = 1$.

The only thing that the above results (Theorems 3 and 4) require is that (i) for any element $j$ that is added, either $a_{j,t}$ is large
enough or $r_{j,\tau}$ is large enough for the initial few ($d_0$) time instants so that condition 3 holds; and (ii) a decaying coefficient
decays to zero within a short delay, $b$. (i) ensures that every newly added support element gets detected either immediately or
within a finite delay; while (ii) ensures removal within finite delay of a decreasing element. For the moving object case, this
translates to requiring that $a_{j,t}$ be large enough. For the first two examples above, this translates to requiring that $r_{j,\tau}$ be large
enough for the first few time instants after $j$ gets added and that $b$ be small enough.

Recall that $\delta_S := \max_{t>0} \delta_S(A_t)$. Other than the above assumption, the results also need that the support estimation
thresholds are set appropriately; that enough measurements, $n_t$, are available at all times $t > 0$ so that condition 2 holds
(this number depends on the support size, $S$, the support change size, $S_a$, and on $b$); and that condition 4 holds.

For the above results, the support errors are bounded by a constant times $S_a$. Thus, under slow support change, the bound
is small compared to the support size, $S_t$, making the above a meaningful result. The reconstruction error is bounded by a
constant times $\epsilon$. Under high enough SNR, this bound is small compared to the signal power. In fact, for Signal Models 2 or
3 the signal power is not bounded. To compare the results, let us fix some of the parameters. Suppose that $b = 3$, $f = S_a$,
$S_0 = S$, $S_{a,t} = S_{r,t} = S_{d,t} = S_a$. Let $d_0 = 2$. The modified-CS result says the following. If

1) $\delta_{S+15S_a} \le 0.207$, and

2) LHS of condition 3 $> \frac{\zeta_M}{\sqrt{S_a}}15\epsilon$,

then $|\tilde{\Delta}_t| \le 4S_a$ and $|\tilde{\Delta}_{e,t}| = 0$ and $\|x_t - \hat{x}_{t,modcs}\| \le 7.50\epsilon$. The modified-CS-add-LS-del result says the following. If

1) $\delta_{S+15S_a} \le 0.207$ (the other two conditions are implied by this), and

2) LHS of condition 3 $> \max(\alpha_{add} + \frac{\zeta_M}{\sqrt{S_a}}7.50\epsilon, 2.24\frac{\zeta_L}{\sqrt{S_a}}\epsilon + 0.522\zeta_Lh)$, where $h^2 = 4(\alpha_{add} + \frac{\zeta_M}{\sqrt{S_a}}7.50\epsilon)^2$,

then $|\tilde{\Delta}_t| \le 4S_a$ and $|\tilde{\Delta}_{e,t}| = 0$ and $\|x_t - \hat{x}_{t,modcs}\| \le 7.50\epsilon$.

The CS result from Lemma 2 says the following. If

1) $\delta_{2S} \le 0.207$,
then $\|x_t - \hat{x}_{t,cs}\| \le 8.57\epsilon$.

Thus, both modified-CS and modified-CS-add-LS-del need the same restricted isometry condition (condition on the number
of measurements). Under the slow support change assumption, $S_a \ll S_t \le S$. In this case, the results for both modified-CS algorithms
hold under a weaker restricted isometry condition (potentially fewer measurements required) than what simple CS
needs for a comparable reconstruction error bound.
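Since the RIC is non-decreasing in its order, comparing the two restricted isometry conditions reduces to comparing the orders at which the same constant, 0.207, is required. The following is a minimal sketch, assuming the modified-CS condition involves order $S + 3(\frac{b+1}{2}S_a + d_0S_a + S_a)$ and the simple CS condition involves order $2S$; the function names are hypothetical.

```python
from fractions import Fraction


def modcs_ric_order(S, S_a, b, d0):
    """Order at which the modified-CS results need the RIC to be at most 0.207."""
    return S + 3 * (Fraction(b + 1, 2) + d0 + 1) * S_a


def simple_cs_ric_order(S):
    """Order at which the simple CS result needs the RIC to be at most 0.207."""
    return 2 * S


def modcs_condition_is_weaker(S, S_a, b, d0):
    """True when the modified-CS order is smaller, i.e. its condition is weaker."""
    return modcs_ric_order(S, S_a, b, d0) < simple_cs_ric_order(S)
```

With $b = 3$ and $d_0 = 2$ the modified-CS order is $S + 15S_a$, so the comparison favors modified-CS exactly when $15S_a < S$. For instance, $S = 200$, $S_a = 2$ gives orders 230 versus 400, whereas $S = 20$, $S_a = 2$ gives 50 versus 40, which makes explicit how slow the support change has to be for the comparison to hold.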

Next we compare the lower bounds on the LHS of condition 3 needed by modified-CS and by modified-CS-add-LS-del. This
requires knowing $\zeta_M$ and $\zeta_L$. To get an idea of the values of $\zeta_M$ and $\zeta_L$, we did simulations based on Signal Model 2 with
$S = 0.1m$, $S_{a,t} = S_{d,t} = S_{r,t} = S_a = 0.01m$, $b = d_{min} = 3$, $r_{j,t} = 1$, $a_{j,t} = 1$ (we generated it using the generative model
given in Appendix A of [1]). The measurement matrices $A_t$ were zero mean random Gaussian $n_t \times m$ matrices with columns
normalized to unit norm. For $t = 0$, $n_0 = 160$; for $t > 0$, $n_t = n = 57$. The measurement noise was $(w_t)_j \sim$ i.i.d. uniform$(-c_t, c_t)$
for $1 \le j \le m$. For $t = 0$, $c_t = 0.01266$; for $t > 0$, $c_t = 0.1266$. We used the same Gaussian measurement matrix $A$ for
$t > 0$. We generated 500 realizations for each choice of $m$, and used both algorithms for reconstruction.
When $m = 200$, we got $\zeta_M = 0.9328\sqrt{S_a}$, $\zeta_L = 0.8734\sqrt{S_a}$; when $m = 1000$, $\zeta_M = 0.8295\sqrt{S_a}$, $\zeta_L = 0.8628\sqrt{S_a}$; when
$m = 2000$, $\zeta_M = 0.8497\sqrt{S_a}$, $\zeta_L = 0.8628\sqrt{S_a}$.

For our comparison, we pick the largest values we got from the above experiment: let $\zeta_M = 0.9328\sqrt{S_a}$ and
$\zeta_L = 0.8734\sqrt{S_a}$. With these values, modified-CS needs LHS of condition 3 $> 13.99\epsilon$ and modified-CS-Add-LS-Del needs
LHS of condition 3 $> \max\{\alpha_{add} + 7.00\epsilon, 10.978\epsilon + 3.246\alpha_{add}\} = 10.978\epsilon + 3.246\alpha_{add}$. With $\alpha_{add}$ small enough, clearly
modified-CS-add-LS-del requires a weaker assumption. As explained earlier and also in [1], $\alpha_{add}$ is a small threshold that is
typically proportional to the noise bound $c$, i.e., $\epsilon/\sqrt{n}$. Thus the mod-CS-Add-LS-Del condition is weaker.
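The modified-CS number and the first term of the maximum can be checked by direct substitution. This is a sketch of the arithmetic only, assuming the value $\zeta_M/\sqrt{S_a} = 0.9328$ picked above and the bounds $\frac{\zeta_M}{\sqrt{S_a}}15\epsilon$ and $\alpha_{add} + \frac{\zeta_M}{\sqrt{S_a}}7.50\epsilon$:

```latex
\frac{\zeta_M}{\sqrt{S_a}} \cdot 15\,\epsilon = 0.9328 \times 15\,\epsilon \approx 13.99\,\epsilon,
\qquad
\alpha_{add} + \frac{\zeta_M}{\sqrt{S_a}} \cdot 7.50\,\epsilon
= \alpha_{add} + 0.9328 \times 7.50\,\epsilon \approx \alpha_{add} + 7.00\,\epsilon .
```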

The comparison between modified-CS and modified-CS-add-LS-del above is not as clear-cut as that in the simple model
case (Signal Model 1). The reason is that the simple model tells us exactly how many support additions and removals occur
at each time; and it also tells us the exact number of elements with a certain magnitude. As a result, it is possible to get a
better bound on $\|(x_t)_{\Delta_{add,t}}\|_2$: this is needed to bound the LS step error. The LS error decides the value of $\alpha_{del}$ and $\alpha_{del}$, in turn,
decides the lower bound on the LHS of condition 3. The current models, Signal Model 2 or 3, are much more flexible, but
this also means that they do not give us exact magnitude information. As a result, the bounds are looser and so the advantage of
modified-CS-add-LS-del is not demonstrated as clearly.

The reason for allowing $d_0$ to take any value is the same as that given in Remark 3 in Sec. III-D.

Remark 13: Finally, we explain why condition 1a of Theorem 4 is stated the way it is. Because of how the modified-CS
error is bounded, we cannot get a bound on the reconstruction error for the $j$th coefficient, $|(x_t)_j - (\hat{x}_t)_j|$. We can only
bound this error by the infinity norm of the error vector. Thus, the only way to get an explicit value for $\alpha_{add}$ is to let it equal the upper bound on
$\|x_t - \hat{x}_t\|_\infty$ and this will ensure $f = 0$ false adds. However, the key point of the add-LS-del procedure is that one can pick
an addition threshold that is smaller than this but results in some false adds, $f$. As long as $f$ is small enough so that $A_{T_{add,t}}$ is
well conditioned (condition 2b holds), the LS step error will be much smaller. With $\alpha_{del}$ chosen appropriately, one can still
delete all of these false adds (as well as all elements of the removed set) in the deletion step.

E. Comparison with the LS-CS result of [12]

In lfl2ll . we obtained a stability result for LS-CS which was a worse algorithm than modified-CS: it required stronger 
conditions for exact recovery, and was worse is simulation experiments as shown in lfl"3l . 0]. The same signal model and the 
same strategy as that of lfT2l can be used for modified-CS as well and we will, in fact, get a stronger stability result for it: the 
modified-CS result will not need condition 3b of the LS-CS stability result (Theorem 2 of ifLTl ). 

The most important difference between the LS-CS result from [12] and our results is that [12] assumed $S_a$ support changes
every $p$ frames and the result required a lower bound on $p$. With this, one could ensure that all newly added support elements
got detected before the next support change time. This meant that one could delete the false adds and removals after all new
adds got detected, but before the next change time. At this time, the signal recovery is very accurate (because of zero misses)
and hence, for the result of [12], a very small deletion threshold could suffice. However, as explained earlier (see Fig. 1),
support change every so often is not a practically valid assumption in most applications. In this work, we allow the support to
change at every time, which is more realistic but is also more difficult to analyze. With this, one always has some misses at
each time instant (except in the simplest case where all new elements are added at very large magnitudes). Thus, one cannot
wait for all the missed elements to get detected before deleting the false adds and removals and hence one requires a larger
deletion threshold.

A third difference is that the signal change model of [12] fixed the number of support additions and removals at each time
to be just $S_a$; it fixed the initial magnitude and the rate of magnitude increase for a new support element $j$ to both be $a_j$ at
all times; and, for decreasing coefficients, it assumed a very specific and fixed rate of magnitude decrease. None of these is a
very practical assumption. Our realistic signal change models (Signal Model 2 or 3) do not fix any of these things.

V. Model Verification 

We verified that two different types of MRI image sequences - a larynx (vocal tract) MRI sequence and a brain functional
MRI sequence - do indeed satisfy Signal Model 2. First we describe model verification for the larynx sequence. We used
a 10 frame sequence and extracted a 36x36 region of this sequence, selected as the region that includes the part where
most of the changes were visible. As shown in earlier work [13], this sequence is approximately sparse in the 2D discrete
wavelet transform (DWT) domain. A two level db4 wavelet was used there. We computed this 2D DWT, re-arranged it as a
vector and computed its 99.9% energy support set. All elements not in this set were set to zero. This gave us an exactly sparse
sequence $x_t$. Its dimension is $m = 36^2 = 1296$. For this sequence, we observed the following. The support set $\mathcal{N}_t$ satisfied
$|\mathcal{N}_t| \le S = 113$ for all $t$. The number of additions from $t-1$ to $t$ satisfied $|\mathcal{N}_t \setminus \mathcal{N}_{t-1}| \le 21$ and the number of removals,
$|\mathcal{N}_{t-1} \setminus \mathcal{N}_t| \le 26$. Thus, $S_a = 26$. Also, the initial nonzero value, $a_{j,t}$, ranged from 13 to 37, the rate of magnitude increase,
rj.t, ranged from 1 to 37, and the duration for which the increase occurred, dj t, ranged from to 4. Also, the maximum delay 
between the time that a coefficient began to decrease and when it was removed was b = 7. 

Next we consider a 64x64 functional MRI sequence. fMRI is a technique that is used to investigate brain function. The
sequence we study here is for the brain responding to a certain type of stimulus (light being turned on and off). This sequence
consisted of a rest state brain sequence to which activation was added based on the models suggested in [42]. The goal is to
be able to accurately extract the activation region from this sequence. As is done in [19], one can use the undersampled
ReProCS algorithm to extract the sparse activation regions from the low rank background brain image sequence, as long as
an initial background brain training sequence is available. In our example, the activation started at frame 71. For the purpose
of ReProCS, the active region "image" (the image that is zero everywhere except in the active region) is the sparse signal
of interest. For a 23 pixel region that is known to correspond to the part of the brain that is affected by the above stimulus,
the activation was added following [42]. The 23 pixel region was split into 2 sub-regions so that the activation intensity was
smallest at the boundary of the region and slowly increased as one moved inwards. We show the 2 regions in Fig. 2(b). $\mathcal{R}_1$
is the innermost region, $\mathcal{R}_2$ is the outermost. The activation in these regions satisfied the following model. For $j \in \mathcal{R}_1$,
$(x_t)_j = b(t)M_a$. For $j \in \mathcal{R}_2$, $(x_t)_j = 0.2b(t)^2M_a$. Here $M_a = 1783$ is the maximum magnitude in the active region and $b(t)$
is the blood oxygenation level dependent (BOLD) signal taken from [42]. It is plotted in Fig. 2(a). This image sequence was
of size 64x64, i.e. its dimension is $m = 64^2 = 4096$. We computed its 99.9% energy support and set all elements not in this
set to zero. This gave us our sparse sequence $x_t$. The support of $x_t$, $\mathcal{N}_t$, satisfied $|\mathcal{N}_t| \le S = 23$ for all $t$. The number
of additions from $t-1$ to $t$ satisfied $|\mathcal{N}_t \setminus \mathcal{N}_{t-1}| \le S_a = 13$ and the number of removals, $|\mathcal{N}_{t-1} \setminus \mathcal{N}_t| \le S_a = 13$. Also, the
initial nonzero value, $a_{j,t}$, ranged from 57 to 97, the rate of magnitude increase, $r_{j,t}$, ranged from 1 to 637, and the duration
for which the increase occurred, $d_{j,t}$, ranged from 6 to 7. Also, the maximum delay between the time that a coefficient began
to decrease and when it was removed was $b = 7$.
Fig. 2: (a) plot of the BOLD signal and of its square; (b) active, transient and inactive brain regions



VI. Setting Algorithm Parameters and Simulation Results 
A. Setting algorithm parameters automatically 

Algorithm 1 has one parameter, $\alpha$. Algorithm 2 has two parameters, $\alpha_{add}$ and $\alpha_{del}$. We explain here how to set these thresholds
automatically. It is often fair to assume that the noise bound $\epsilon$ is known, e.g. it can be estimated using a short initial noise-only
training sequence. We assume this here. In cases where it is not known or can change with time, one can approximate it
by $\|y_{t-1} - A_{t-1}\hat{x}_{t-1}\|_2$ (assuming accurate recovery at $t-1$).

Define the minimum nonzero value at time $t$, $x_{min,t} := \min_{j \in \mathcal{N}_t} |(x_t)_j|$. This can be estimated as $\hat{x}_{min,t} =
\min_{j \in \tilde{T}_{t-1}} |(\hat{x}_{t-1})_j|$.

When setting the thresholds automatically, they will change with time. We set a a dd,t using the following heuristic. By Lemma 
E we have {x t - ^t.add)^,* = i A T^J A T ^ t )- l [A Tm Jw t + A T ^J A^{x t )^ t \- To ensure that this is bounded, we need 
II^TiM^I and \\{A Tai J A Tm , t )- l \\ to be bounded. Since \\A %M J\\ = ^—r^-j and \\(A %M t ' A %M J" 1 !) = ^.jA^ t) , 
we pick a a dd,t as smallest number such that &min{A% M t ) > 0.4. 

If one could set $\alpha_{del}$ equal to the lower bound on $x_{min,t} - \|(x_t - \hat{x}_{t,add})_{T_{add,t}}\|_\infty$, there will be zero misses. Using this idea,
we let $\alpha_{del,t}$ be an estimate of the lower bound of this quantity. Notice that

||(zt-£t,add)r ailM ||oo < \\{Ar m ,t A ^ x tA^ +^7^,^*11°° 

addlloo 4" || A Jl Mt W t || OO 

where C\, C2 are some constant larger than 1. Here we use the fact that for any matrix B, ||S||oo < Cill^ll f° r some constant 
C\ and that only small elements are missed and hence we can approximate Hxt.A^d lloo by C2 times i m i nj t where C2 is a 
small constant larger than 1. We cannot compute 9yj- M t |,A ad d' but it is fair to assume that it is small (significantly smaller than 
one). If we assume that 

CiC 2 ||(^radd, t '^radd, t )~ 1 ||oo^|r a dd, t |,|A a j d | < 0.3, 
then the above bound simplifies to $0.3x_{min,t} + \|A_{T_{add,t}}^\dagger w_t\|_\infty$. We can approximate $w_t$ by $y_t - A\hat{x}_{t,modcs}$. Thus, we set $\alpha_{del,t} =
0.7\hat{x}_{min,t} - \|A_{T_{add,t}}^\dagger(y_t - A\hat{x}_{t,modcs})\|_\infty$.

For Algorithm 1, we set $\alpha_t$ as follows. If $\|x_t - \hat{x}_{t,modcs}\|_\infty \le Cx_{min,t}$ for some $C < 1$, then setting $\alpha_t = (1 - C)x_{min,t}$
will ensure that there are no misses. If this bound holds for most entries $i$, then most entries will be correctly recovered, i.e.,
there will be few misses. If we ensure $\sigma_{min}(A_{\tilde{T}_t}) \ge 0.4$ then the number of extras will be bounded. To try to ensure that
both the above hold, we let $\alpha_t$ be the smallest value such that $\min_{j \in \tilde{T}_t} |(\hat{x}_{t,modcs})_j| \ge (1 - C)\hat{x}_{min,t} = 0.5\hat{x}_{min,t}$ (we pick
$C = 0.5$), and $\sigma_{min}(A_{\tilde{T}_t}) \ge 0.4$.

To get a more robust estimate of the minimum nonzero value of $x_t$, we use a short-time average of $\{\hat{x}_{min,\tau}, t - t_0 \le \tau < t\}$
as the estimate of $x_{min,t}$. In our experiments, $t_0 = 10$.
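The minimum-nonzero-value estimate and its short-time average are simple to compute. The following is a minimal sketch, assuming the per-frame estimate is the smallest magnitude of the previous reconstruction on the previous support estimate, and that the baseline threshold is $(1 - C)$ times the smoothed estimate; it does not cover the singular-value part of the rule, and all function names are hypothetical.

```python
def min_nonzero_estimate(x_hat_prev, support_prev):
    """Smallest magnitude of the previous reconstruction on its estimated support."""
    return min(abs(x_hat_prev[j]) for j in support_prev)


def smoothed_min(history, t0=10):
    """Average of the last t0 per-frame estimates (fewer if the history is shorter)."""
    window = history[-t0:]
    return sum(window) / len(window)


def baseline_threshold(x_min_est, C=0.5):
    """Threshold (1 - C) * x_min, which avoids misses if the error is at most C * x_min."""
    return (1 - C) * x_min_est


# Toy usage: one previous reconstruction and a short history of estimates.
x_prev = [0.0, -2.0, 3.5, 0.0, 1.5]
est = min_nonzero_estimate(x_prev, {1, 2, 4})
avg = smoothed_min([4.0, 2.0, 3.0], t0=2)
```

Averaging over a window guards against a single frame in which one support element happens to be reconstructed with an unusually small magnitude.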

B. Simulation Results 

In the discussion so far, we only compared sufficient conditions required by different algorithms. The general conclusion 
obtained by comparing the sufficient conditions was that modified-CS-add-LS-del is the best algorithm followed by modified- 
CS and then simple-CS. In this section, we use simulations to demonstrate the same thing. We compared simple CS (solves 
© at each time instant), modified-CS(mod-CS) as given in Algorithm [TJ and modified-CS-add-LS-del (mod-CS-Add-LS-Del) 
as given in Algorithm [2] The parameters for the algorithms were set as explained in Sec IVI-AI above. 

The data was generated as follows. We used Signal Model 2, generated as explained in Appendix G, with $m = 200$, $S = 20$, $d_{\min} = 3$, $a_{\min} = r_{\min}(d_{\min}) = r$, $S_a = 2$, $b = 3$, $\ell = a_{\min} + d_{\min} r_{\min}(d_{\min}) = 4r$, and $r$ was varied. The measurement matrices $A_t$ were zero mean random Gaussian $n_t \times m$ matrices with columns normalized to unit norm. We used $n_0 = 160$ and $n_t = n = 57$ for $t > 0$. The measurement noise was $(w_t)_j \sim \text{uniform}(-c_t, c_t)$ for $1 \le j \le m$. For $t = 0$, $c_t = 0.01266$; for $t \ge 1$, $c_t = c = 0.1266$. In the first set of experiments, shown in Fig. 3, we used the same measurement matrix $A_t = A$ for all $t \ge 1$. In the second experiment, shown in Fig. 4, $A_t$ was time varying.
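A minimal sketch of how such a measurement matrix and noise vector could be drawn, using only the Python standard library. The helper names and the fixed seed are illustrative choices, not taken from the paper; the noise vector is given one entry per measurement here, and $\epsilon = c\sqrt{n}$ is used as the noise-norm bound, which matches the relation $4c = 4\epsilon/\sqrt{n}$ quoted later in this section.

```python
import math
import random

def gaussian_unit_norm_columns(n, m, rng):
    """n x m matrix with i.i.d. zero-mean Gaussian entries, columns scaled to unit l2 norm."""
    cols = []
    for _ in range(m):
        col = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in col))
        cols.append([v / norm for v in col])
    # Transpose the list of columns into a list of rows.
    return [[cols[j][i] for j in range(m)] for i in range(n)]

def uniform_noise(n, c, rng):
    """Noise vector with i.i.d. entries drawn uniformly from [-c, c]."""
    return [rng.uniform(-c, c) for _ in range(n)]

rng = random.Random(0)        # fixed seed, an arbitrary choice for reproducibility
m, n, c = 200, 57, 0.1266     # values quoted in the text for t > 0
A = gaussian_unit_norm_columns(n, m, rng)
w = uniform_noise(n, c, rng)
eps = c * math.sqrt(n)        # worst-case l2 norm of w, so ||w|| <= eps always holds
```

Because every entry of the noise is bounded by $c$ in magnitude, its $\ell_2$ norm can never exceed $c\sqrt{n}$, which is why a bounded-noise model gives a deterministic $\epsilon$.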

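The normalized mean squared error (NMSE), $\frac{E[\|x_t - \hat{x}_t\|^2]}{E[\|x_t\|^2]}$, the normalized mean extras, $\frac{E[|\hat{\mathcal{N}}_t \setminus \mathcal{N}_t|]}{E[|\mathcal{N}_t|]}$, and the normalized mean misses, $\frac{E[|\mathcal{N}_t \setminus \hat{\mathcal{N}}_t|]}{E[|\mathcal{N}_t|]}$, are used to compare the reconstruction performance. Here $E[\cdot]$ denotes the empirical mean over the 500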
realizations. Consider the results of Fig. 3. Clearly, both mod-CS and mod-CS-Add-LS-Del significantly outperform simple CS. This is because for $t > 0$, the number of measurements, $n_t = 57$, is too small for a length-200, 20-sparse signal. When $a_{\min} = r_{\min}(d_{\min}) = r$ is large enough, both mod-CS and mod-CS-Add-LS-Del are stable at 5% error or less. When $r$ is reduced, mod-CS becomes unstable. Of course, when $r$ is reduced even further to $r = 0.2$, both become unstable (not shown). In Fig. 4 we show results for the case when $A_t$ changes with time and all other parameters are the same as in Fig. 3(a). Clearly in this case, the performance of both mod-CS and mod-CS-add-LS-del has improved significantly.
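The three error measures can be computed at a fixed time instant as in the sketch below. It assumes the ratio-of-empirical-means normalization (mean squared error over mean signal energy, and mean extras or misses over mean support size); the function name and the toy data are hypothetical.

```python
def support_metrics(trials):
    """trials: list of (x_true, x_hat, N_true, N_hat), one tuple per realization.

    Returns (NMSE, normalized mean extras, normalized mean misses). Ratios of
    sums equal ratios of empirical means because both share the same count.
    """
    sq_err = sum(sum((a - b) ** 2 for a, b in zip(x, xh)) for x, xh, _, _ in trials)
    energy = sum(sum(a * a for a in x) for x, _, _, _ in trials)
    extras = sum(len(Nh - N) for _, _, N, Nh in trials)   # estimated but not true
    misses = sum(len(N - Nh) for _, _, N, Nh in trials)   # true but not estimated
    size = sum(len(N) for _, _, N, _ in trials)
    return sq_err / energy, extras / size, misses / size

# Two toy realizations, for illustration only.
trials = [
    ([1.0, 0.0, 2.0], [1.0, 0.5, 1.5], {0, 2}, {0, 1, 2}),  # one extra
    ([0.0, 3.0, 1.0], [0.0, 3.0, 0.0], {1, 2}, {1}),        # one miss
]
nmse, norm_extras, norm_misses = support_metrics(trials)
```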

In Fig. 5 we plot the average value of $\alpha_{\text{add},t}$ for the simulations corresponding to Fig. 4. As can be seen, this threshold is close to $4c = 4\epsilon/\sqrt{n}$ at all times.

For solving the $\ell_1$ minimization problems, we used the YALL1 software, which is available at http://yall1.blogs.rice.edu/. Both the modified-CS algorithms and simple CS took roughly the same amount of time. For the results of Fig. 4, when running the code in MATLAB on the same server, simple CS needed 0.0466 seconds per frame, mod-CS needed 0.0432 seconds per frame, and mod-CS-Add-LS-Del needed 0.0517 seconds per frame. These numbers are computed by averaging over all 500 realizations and over the 200 time instants per realization.
Fig. 3: Error Comparison with Fixed Measurement Matrix
[Panel title: NMSE comparison, $r_{\min}(d_{\min}) = 0.3$, $n = 57$]
Fig. 4: Error Comparison with Time Variant Measurement Matrices
Fig. 5: Mean of $\alpha_{\text{add}}$ over time




VII. Conclusions and Future Work 

In this work we obtained performance guarantees for recursive noisy modified-CS which has been shown in earlier work 
to be a practically useful algorithm CPU . Il43l . Il28l . We show that, under a realistic practically valid signal model and mild 
assumptions - a lower bound on either the initial nonzero magnitude or the magnitude increase rate; some RIP conditions 
(which imply conditions on the required number of measurements); appropriately set algorithm parameters; and a special start 
condition - the support and signal recovery error of modified-CS and its improvement can be bounded by time-invariant and 
small values. 

The special start condition is a possible limitation of our analysis. This can be removed in various ways. If some prior knowledge about the signal support is available, that can be used at $t = 0$ as suggested and demonstrated in [13]. Or, one can solve a batch problem (multiple measurement vector (MMV) problem) for the first set of $k$ frames. If we let $\mathcal{N} = \cup_{t=1}^{k} \mathcal{N}_t$, then we have an MMV problem with row support $\mathcal{N}$ that can be solved using mixed norm minimization [44], simultaneous-OMP [45], [46], compressive MUSIC [47], iterative MUSIC [48], block sparsity approaches [49] or M-SBL (Sparse Bayesian Learning) [50]. In this case one could adopt the guarantees of the chosen batch method for the initialization.

In this work, we used a deterministic set of assumptions on signal change. Notice however that one can assume any probabilistic model that ensures that $a_{j,t} \ge a_{\min}$ and that $r_{j,\tau}$ is larger than $r_{\min}(d_0)$ for the first $d_0$ frames after a new addition; at later times, $r_{j,\tau}$ can be anything between zero and infinity. Similarly, any probabilistic model for coefficient decrease that ensures removal within at most $b$ frames after the decrease begins will suffice. We can fix $d_0$ to be any integer between zero and $d_{\min}$ and our result will then hold for that particular value of $d_0$.

Other ongoing and future work includes designing and analyzing better support prediction techniques, rather than just using the previous support estimate as the prediction for the current support. Some initial ideas are presented in [51].



Appendix 

A. Proof of Lemma 1

We provide the proof here for the sake of completeness and for ease of review. This will be removed later. Let $h := \hat{x}_{\text{modcs}} - x$. We adapt the approach of [10] to bound the reconstruction error, $\|h\| := \|\hat{x}_{\text{modcs}} - x\|$. A similar result was obtained in [27]. Let $\Delta_1$ denote the set of indices of $h$ with the $|\Delta|$ largest values outside of $T \cup \Delta$, let $\Delta_2$ denote the indices of the next $|\Delta|$ largest values and so on. Then using the same approach as that of [10], i.e., $\|h_{\Delta_j}\| \le \frac{1}{\sqrt{|\Delta|}}\|h_{\Delta_{j-1}}\|_1$,

$$\|h_{(T \cup \Delta \cup \Delta_1)^c}\| \le \sum_{j \ge 2} \|h_{\Delta_j}\| \le \frac{1}{\sqrt{|\Delta|}} \|h_{(T \cup \Delta)^c}\|_1 \qquad (20)$$

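Since $\hat{x}_{\text{modcs}} = x + h$ is the minimizer of the modified-CS problem, since both $x$ and $\hat{x}_{\text{modcs}}$ are feasible, and since $x$ is supported on $\mathcal{N} \subseteq T \cup \Delta$,

$$\|x_{\Delta}\|_1 = \|x_{T^c}\|_1 \ge \|(x + h)_{T^c}\|_1 \ge \|x_{\Delta}\|_1 - \|h_{\Delta}\|_1 + \|h_{(T \cup \Delta)^c}\|_1 \qquad (21)$$

Thus,

$$\|h_{(T \cup \Delta)^c}\|_1 \le \|h_{\Delta}\|_1 \qquad (22)$$

Combining this with (20), and using $\|h_{\Delta}\|_1 / \sqrt{|\Delta|} \le \|h_{\Delta}\|$, we get

$$\|h_{(T \cup \Delta \cup \Delta_1)^c}\| \le \sum_{j \ge 2} \|h_{\Delta_j}\| \le \|h_{\Delta}\| \qquad (23)$$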

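Next, since both $x$ and $\hat{x}_{\text{modcs}}$ are feasible,

$$\|Ah\| = \|A(x - \hat{x}_{\text{modcs}})\| \le \|y - Ax\| + \|y - A\hat{x}_{\text{modcs}}\| \le 2\epsilon \qquad (24)$$

In this proof, let

$$\delta := \delta_{|T| + 3|\Delta|} \qquad (25)$$

Now, we upper bound $\|h_{T \cup \Delta \cup \Delta_1}\|$. Since $\delta_{|T| + 2|\Delta|} \le \delta$, we have

$$(1 - \delta)\|h_{T \cup \Delta \cup \Delta_1}\|^2 \le \|A h_{T \cup \Delta \cup \Delta_1}\|^2 \qquad (26)$$

To bound the RHS of the above, notice that $A h_{T \cup \Delta \cup \Delta_1} = Ah - \sum_{j \ge 2} A h_{\Delta_j}$ and so

$$\|A h_{T \cup \Delta \cup \Delta_1}\|^2 = \langle A h_{T \cup \Delta \cup \Delta_1}, Ah \rangle - \sum_{j \ge 2} \langle A h_{T \cup \Delta \cup \Delta_1}, A h_{\Delta_j} \rangle$$

Using (24), the definition of $\delta_S$, and $\delta_{|T| + 2|\Delta|} \le \delta$,

$$|\langle A h_{T \cup \Delta \cup \Delta_1}, Ah \rangle| \le 2\epsilon\sqrt{1 + \delta}\,\|h_{T \cup \Delta \cup \Delta_1}\| \qquad (27)$$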

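Using the definition of $\theta_{S_1,S_2}$, equation (23), and the fact that $\|h_T\| + \|h_{\Delta \cup \Delta_1}\| \le \sqrt{2}\,\|h_{T \cup \Delta \cup \Delta_1}\|$, we get the following. Using $\theta_{|T|,|\Delta|} \le \delta_{|T|+|\Delta|} \le \delta_{|T|+3|\Delta|}$ and $\theta_{2|\Delta|,|\Delta|} \le \delta_{3|\Delta|} \le \delta_{|T|+3|\Delta|} = \delta$,

$$\Big|\sum_{j \ge 2} \langle A h_{T \cup \Delta \cup \Delta_1}, A h_{\Delta_j} \rangle\Big| \le \theta_{|T|+2|\Delta|,|\Delta|}\,\|h_{T \cup \Delta \cup \Delta_1}\| \sum_{j \ge 2} \|h_{\Delta_j}\| \le \delta\,\|h_{T \cup \Delta \cup \Delta_1}\|\,\|h_{\Delta}\| \qquad (28)$$

Combining the last six equations above, and using $\|h_{\Delta}\| \le \|h_{T \cup \Delta \cup \Delta_1}\|$, we can simplify the above to get

$$\|h\| \le 2\|h_{T \cup \Delta \cup \Delta_1}\| \le \frac{4\sqrt{1 + \delta}}{1 - 2\delta}\,\epsilon \qquad (29)$$

Clearly, all of the above discussion holds only if the RHS is positive, which is true only if $2\delta_{|T|+3|\Delta|} < 1$. Thus, we get Lemma 1.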
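As a numerical sanity check on the constants, suppose the final bound of this proof is read as $\|h\| \le \frac{4\sqrt{1+\delta}}{1-2\delta}\epsilon$ (our reading of the display, which is damaged in this copy). Evaluating it at $\delta = (\sqrt{2}-1)/2 \approx 0.207$, the RIP level used in the later proofs, gives the factor 7.50 that appears there. The function name below is hypothetical.

```python
import math

def lemma1_constant(delta):
    """Error multiplier 4*sqrt(1 + delta) / (1 - 2*delta); needs 0 <= delta < 1/2."""
    if not 0 <= delta < 0.5:
        raise ValueError("need 0 <= delta < 1/2")
    return 4.0 * math.sqrt(1.0 + delta) / (1.0 - 2.0 * delta)

delta_max = (math.sqrt(2.0) - 1.0) / 2.0   # approximately 0.207
C1 = lemma1_constant(delta_max)            # approximately 7.50
```

The multiplier grows without bound as $\delta$ approaches $1/2$, which is the positivity requirement stated at the end of the proof.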
B. Proof of Theorem 1

We prove the first claim by induction. Using condition 4 of the theorem, the claim holds for $t = 0$. This proves the base case. For the induction step, assume that the claim holds at $t - 1$, i.e. $|\tilde{\Delta}_{e,t-1}| = 0$, $|\tilde{T}_{t-1}| \le S$, and $\tilde{\Delta}_{t-1} \subseteq \mathcal{S}_{t-1}(d_0)$ so that $|\tilde{\Delta}_{t-1}| \le 2(d_0 - 1)S_a$. Using this we prove that the claim holds at $t$. In the proof, we use the following facts often: (a) $\mathcal{R}_t \subseteq \mathcal{N}_{t-1}$ and $\mathcal{A}_t \subseteq \mathcal{N}_{t-1}^c$; (b) $\mathcal{N}_t = \mathcal{N}_{t-1} \cup \mathcal{A}_t \setminus \mathcal{R}_t$; and (c) if two sets $B, C$ are disjoint, then $D \cup C \setminus B := (D \cup C) \setminus B = (D \cap B^c) \cup C$ for any set $D$.
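The set identity in the last fact is easy to check numerically. The sketch below tests $(D \cup C) \setminus B = (D \cap B^c) \cup C$ on random subsets of a small universe with $B$ and $C$ disjoint; the universe size, sampling probabilities and seed are arbitrary choices.

```python
import random

def check_identity(D, C, B):
    """For disjoint B and C, test whether (D | C) - B equals (D - B) | C."""
    assert B.isdisjoint(C)
    return (D | C) - B == (D - B) | C

rng = random.Random(1)
universe = range(30)
all_ok = True
for _ in range(1000):
    B = {i for i in universe if rng.random() < 0.3}
    C = {i for i in universe if i not in B and rng.random() < 0.3}  # disjoint from B
    D = {i for i in universe if rng.random() < 0.5}
    all_ok = all_ok and check_identity(D, C, B)
```

Disjointness matters: if $C$ overlapped $B$, the left side would drop the shared elements while the right side would keep them.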

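We first bound $|T_t|$, $|\Delta_{e,t}|$, $|\Delta_t|$. Since $T_t = \tilde{T}_{t-1} = \hat{\mathcal{N}}_{t-1}$, $|T_t| \le S$. Also, $\Delta_{e,t} = \hat{\mathcal{N}}_{t-1} \setminus \mathcal{N}_t = \hat{\mathcal{N}}_{t-1} \cap [(\mathcal{N}_{t-1}^c \cap \mathcal{A}_t^c) \cup \mathcal{R}_t] \subseteq \tilde{\Delta}_{e,t-1} \cup \mathcal{R}_t = \mathcal{R}_t$. The last equality follows since $|\tilde{\Delta}_{e,t-1}| = 0$. Thus $|\Delta_{e,t}| \le |\mathcal{R}_t| = S_a$.

Consider $|\Delta_t|$. Notice that $\Delta_t = \mathcal{N}_t \setminus \hat{\mathcal{N}}_{t-1} = (\mathcal{N}_{t-1} \cap \hat{\mathcal{N}}_{t-1}^c \cap \mathcal{R}_t^c) \cup (\mathcal{A}_t \cap \hat{\mathcal{N}}_{t-1}^c) = (\tilde{\Delta}_{t-1} \cap \mathcal{R}_t^c) \cup (\mathcal{A}_t \cap \hat{\mathcal{N}}_{t-1}^c) \subseteq (\mathcal{S}_{t-1}(d_0) \cap \mathcal{R}_t^c) \cup \mathcal{A}_t = \mathcal{S}_{t-1}(d_0) \cup \mathcal{A}_t \setminus \mathcal{R}_t$. Here we used $\tilde{\Delta}_{t-1} \subseteq \mathcal{S}_{t-1}(d_0)$. When $d_0 \ge 2$, $\mathcal{R}_t \subseteq \mathcal{S}_{t-1}(d_0)$ and $\mathcal{A}_t$ is disjoint with $\mathcal{S}_{t-1}(d_0)$. Thus $|\Delta_t| \le |\mathcal{S}_{t-1}(d_0)| + |\mathcal{A}_t| - |\mathcal{R}_t| = 2(d_0 - 1)S_a + S_a - S_a$. When $d_0 = 1$, $\mathcal{S}_{t-1}(d_0) = \emptyset$, and $\mathcal{A}_t$ is disjoint with $\mathcal{R}_t$. Thus $|\Delta_t| \le |\mathcal{A}_t \setminus \mathcal{R}_t| = |\mathcal{A}_t| = S_a$. Thus, $|\Delta_t| \le k_1 S_a$.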

Next we bound $|\tilde{\Delta}_t|$, $|\tilde{\Delta}_{e,t}|$, $|\tilde{T}_t|$. Consider the support estimation step. Apply the first claim of Lemma 4 with $S_N = S$, $S_{\Delta_e} = S_a$, $S_{\Delta} = k_1 S_a$, and $b_1 = d_0 r$. Since conditions 2 and 3 of the theorem hold, all elements of $\mathcal{N}_t$ with magnitude equal to or greater than $d_0 r$ will get detected. Thus, $\tilde{\Delta}_t \subseteq \mathcal{S}_t(d_0)$. Apply the second claim of the lemma. Since conditions 2 and 1 hold, all zero elements will get deleted and there will be no false detections, i.e. $|\tilde{\Delta}_{e,t}| = 0$. Finally, $|\tilde{T}_t| \le |\mathcal{N}_t| + |\tilde{\Delta}_{e,t}| \le S + 0$.

The second claim for time $t$ follows using the first claim for time $t - 1$ and the arguments from the paragraphs above. The third claim follows using the second claim and Lemma 1.

C. Proof of Theorem 2

We prove the first claim of the theorem by induction. Using condition 4 of the theorem, the claim holds for $t = 0$. This proves the base case. For the induction step, assume that the claim holds at $t - 1$, i.e. $|\tilde{\Delta}_{e,t-1}| = 0$, $|\tilde{T}_{t-1}| \le S$, and $\tilde{\Delta}_{t-1} \subseteq \mathcal{S}_{t-1}(d_0)$ so that $|\tilde{\Delta}_{t-1}| \le 2(d_0 - 1)S_a$. Using this, we prove that the claim holds at $t$. We will use the following facts often: (a) $\mathcal{R}_t \subseteq \mathcal{N}_{t-1}$, (b) $\mathcal{A}_t \subseteq \mathcal{N}_{t-1}^c$, (c) $\mathcal{N}_t = \mathcal{N}_{t-1} \cup \mathcal{A}_t \setminus \mathcal{R}_t$, and (d) if two sets $B, C$ are disjoint, then $D \cup C \setminus B := (D \cup C) \setminus B = (D \cap B^c) \cup C$ for any set $D$.

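The bounding of $|T_t|$, $|\Delta_t|$, $|\Delta_{e,t}|$ is exactly as in the proof of Theorem 1. Since $T_t = \tilde{T}_{t-1}$, $|T_t| \le S$. Also, $\Delta_{e,t} = \hat{\mathcal{N}}_{t-1} \setminus \mathcal{N}_t = \hat{\mathcal{N}}_{t-1} \cap [(\mathcal{N}_{t-1}^c \cap \mathcal{A}_t^c) \cup \mathcal{R}_t] \subseteq \tilde{\Delta}_{e,t-1} \cup \mathcal{R}_t = \mathcal{R}_t$. Thus $|\Delta_{e,t}| \le |\mathcal{R}_t| = S_a$. Finally, $\Delta_t = \mathcal{N}_t \setminus \hat{\mathcal{N}}_{t-1} = (\tilde{\Delta}_{t-1} \cap \mathcal{R}_t^c) \cup (\mathcal{A}_t \cap \hat{\mathcal{N}}_{t-1}^c) \subseteq (\mathcal{S}_{t-1}(d_0) \cap \mathcal{R}_t^c) \cup \mathcal{A}_t$. Thus,

$$\Delta_t \subseteq \mathcal{S}_{t-1}(d_0) \cup \mathcal{A}_t \setminus \mathcal{R}_t \qquad (30)$$

When $d_0 \ge 2$, $\mathcal{R}_t \subseteq \mathcal{S}_{t-1}(d_0)$ and $\mathcal{A}_t$ is disjoint with $\mathcal{S}_{t-1}(d_0)$, so $|\Delta_t| \le |\mathcal{S}_{t-1}(d_0)| + |\mathcal{A}_t| - |\mathcal{R}_t| = 2(d_0 - 1)S_a + S_a - S_a$. When $d_0 = 1$, $\mathcal{S}_{t-1}(d_0) = \emptyset$, and $\mathcal{A}_t$ is disjoint with $\mathcal{R}_t$, so $|\Delta_t| \le |\mathcal{A}_t \setminus \mathcal{R}_t| = |\mathcal{A}_t| = S_a$. Thus, $|\Delta_t| \le k_1 S_a$.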

Consider the detection step. There are at most $f$ false detects (from condition 1a) and thus $|\Delta_{e,\text{add},t}| \le |\Delta_{e,t}| + f \le S_a + f$. Thus $|T_{\text{add},t}| \le |\mathcal{N}_t| + |\Delta_{e,\text{add},t}| \le S + S_a + f$.

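Next, consider $|\Delta_{\text{add},t}|$. Notice that

$$\Delta_t \subseteq \mathcal{S}_{t-1}(d_0) \cup \mathcal{A}_t \setminus \mathcal{R}_t \subseteq \mathcal{S}_t(d_0) \cup \mathcal{I}_t(d_0) \setminus \mathcal{D}_t(d_0 - 1). \qquad (31)$$

The first $\subseteq$ is from (30); the second one follows by using (12) for $j = d_0$. Now, apply Lemma 5 with $S_{N_t} = S$, $S_{\Delta_{e,t}} = S_a$, $S_{\Delta_t} = k_1 S_a$, and with $b_1 = d_0 r$. Using (31), $\{i \in \Delta_t : |(x_t)_i| \ge b_1\} = \Delta_t \cap \mathcal{I}_t(d_0)$. Since conditions 2 and 3 hold, by Lemma 5, all elements of $\{i \in \Delta_t : |(x_t)_i| \ge b_1\}$ will definitely get detected at time $t$. Thus $\Delta_{\text{add},t} \subseteq \Delta_t \setminus \{i \in \Delta_t : |(x_t)_i| \ge b_1\} \subseteq \Delta_t \setminus \mathcal{I}_t(d_0)$. But from (31), $\Delta_t \setminus \mathcal{I}_t(d_0) \subseteq \mathcal{S}_t(d_0) \setminus \mathcal{D}_t(d_0 - 1)$. When $d_0 \ge 2$, $\mathcal{D}_t(d_0 - 1) \subseteq \mathcal{S}_t(d_0)$, so $|\Delta_{\text{add},t}| \le |\mathcal{S}_t(d_0)| - |\mathcal{D}_t(d_0 - 1)| = 2(d_0 - 1)S_a - S_a$; when $d_0 = 1$, $\mathcal{D}_t(d_0 - 1) = \mathcal{S}_t(d_0) = \emptyset$, so $|\Delta_{\text{add},t}| = 0$. Thus, $|\Delta_{\text{add},t}| \le k_2 S_a$.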
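The two case analyses above can be summarized as worked arithmetic. The constants $k_1$ and $k_2$ are defined earlier in the paper; the closed forms used below are inferred from the cases $d_0 \ge 2$ and $d_0 = 1$ in this proof and should be read as an assumption, not as the paper's definitions.

```python
def k1(d0):
    """Bound |Delta_t| <= k1 * S_a: 2(d0 - 1) for d0 >= 2, and 1 for d0 = 1."""
    return max(1, 2 * (d0 - 1))

def k2(d0):
    """Bound |Delta_add,t| <= k2 * S_a: 2(d0 - 1) - 1 for d0 >= 2, and 0 for d0 = 1."""
    return max(0, 2 * (d0 - 1) - 1)

S_a = 2  # value used in the simulations of this paper
# Map each d0 to the pair (bound on |Delta_t|, bound on |Delta_add,t|).
bounds = {d0: (k1(d0) * S_a, k2(d0) * S_a) for d0 in (1, 2, 3)}
```

For every $d_0$ the second bound is smaller than the first, reflecting that the detection step removes the large missed elements before the least squares estimate is computed.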

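Consider the deletion step. Apply Lemma 6 with $S_{T,\text{add},t} = S + S_a + f$, $S_{\Delta,\text{add},t} = k_2 S_a$. Since condition 2b holds, $\delta_{S+S_a+f} < 1/2$ holds. Since $\Delta_{\text{add},t} \subseteq \mathcal{S}_t(d_0) \setminus \mathcal{D}_t(d_0 - 1)$, $\Delta_{\text{add},t}$ contains at most $2S_a$ elements of each of the magnitudes $\{r, 2r, \dots, (d_0 - 2)r\}$ and at most $S_a$ elements of magnitude $(d_0 - 1)r$. Thus, $\|(x_t)_{\Delta_{\text{add},t}}\| \le k_3\sqrt{S_a}\,r$. Using these facts and condition 1b, by Lemma 6, all elements of $\Delta_{e,\text{add},t}$ will get deleted. Thus $|\tilde{\Delta}_{e,t}| = 0$. Thus $|\tilde{T}_t| \le |\mathcal{N}_t| + |\tilde{\Delta}_{e,t}| \le S$.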

To bound |A(|, apply Lemma [6] with S% Mt = S + S a + /, SA mi = k 2 S a , b\ = dor. By Lemma [6] to ensure that all 
elements of {i g 7add,t : 1(^0*1 — ^1} do not get falsely deleted, we need Js +s„+/ < 1/2 and dor > o^ei + "^=(v / 2 £ + 
20 So +s a +fMSak?,VSlr). From condition [Tb] a de i = J ^Cl£ + 2k 3 8 s+ s a +f\k 2 s a (Lr. Thus, we need S So +s a +f < 1/2 and 
dor > 2(«/ -^-Ql^ + ^k 3 9s+s a +f t k 2 SaCL r )- ^s +s a +f < 1/2 holds since condition [2bl holds. The second one holds since
condition 2c and $r \ge G_2$ of condition 3 hold. Thus, we can ensure that all elements of $\{i \in T_{\text{add},t} : |(x_t)_i| \ge b_1\}$, i.e. all elements of $T_{\text{add},t}$ with magnitude greater than or equal to $b_1 = d_0 r$, do not get falsely deleted. But nothing can be said about the elements smaller than $d_0 r$ (in the worst case all of them may get falsely deleted). Thus, $\tilde{\Delta}_t \subseteq \mathcal{S}_t(d_0)$ and so $|\tilde{\Delta}_t| \le 2(d_0 - 1)S_a$.

This finishes the proof of the first claim. To prove the second and third claims for any t > 0: use the first claim for t — 1 
and the arguments from the paragraphs above to show that the second and third claim hold for t. The fourth claim follows 
using the previous claims and Lemma [TJ The fifth claim follows using previous claims, Lemma [3] and a bound on IKx^aJU- 
It is easy to see that IK^t^Jh < k^^/S^r. 



D. Proof of Theorem 3

Recall from the signal model that $|\mathcal{N}_t| \le S$ for all $t$, and that $|\mathcal{SD}_t| \le \frac{b+1}{2}S_d$. Also $\mathcal{N}_t = \cup_{\tau = t - d_{\min} + 1}^{t} \mathcal{A}_\tau \cup \mathcal{C}_t \cup \mathcal{SD}_t$, noting that the first two sets might not be disjoint.

The proof follows using induction. The base case is easy. Assume that the result holds at $t - 1$. At $t$, at most $S_a$ new elements get added to the support, thus $|\Delta_t| \le |\tilde{\Delta}_{t-1}| + S_a \le \frac{b+1}{2}S_d + d_0 S_a + S_a$. Also, since $T_t = \tilde{T}_{t-1}$, $|T_t| \le S$. And $\Delta_{e,t} \subseteq \tilde{\Delta}_{e,t-1} \cup \mathcal{R}_t$, indicating $|\Delta_{e,t}| \le |\tilde{\Delta}_{e,t-1}| + |\mathcal{R}_t| \le S_r$. The second condition of the theorem ensures that $\delta_{|T_t| + 3|\Delta_t|} \le (\sqrt{2} - 1)/2$. Thus using Lemma 1, $\|x_t - \hat{x}_t\| \le 7.50\epsilon$.

Consider the support detection step. Consider an j ^ A/j, i.e. (xt)i = 0. Since a = ^=7.50e > ;^=||^t — > 
1 1 St — &t||oo > l(^t)i|> m us i will never get detected into the support estimate. Thus, |A e t| = 0. Thus \7t\ < \Aft\ + |A e ,t| < S. 

The third condition ensures that any newly added element exceeds a + -^=7.50e within do time units and any element of C t 
exceeds a + -^=7.50e as I > a + -^=7.50e. Consider any such element j. This means that \(x t )j \ > \ (%t)j I — I ( x t ~ x t)j\ > 
|(a; t )j| — \\xt — Xt\\oc > \(x t )j\ — -^=||xt — x t \\ > |(^t)jl — '^ = '''-^ e — a - Thus such an element will definitely get detected 
into the support. This means that the only nonzero elements that are missed are either those that got added in the last $d_0$ frames or those that are currently decreasing. The maximum number of elements that got added in the last $d_0$ time units is $d_0 S_a$. The maximum number of decreasing elements at $t$ is less than or equal to $\frac{b+1}{2}S_d$. Thus, $|\tilde{\Delta}_t| \le \frac{b+1}{2}S_d + d_0 S_a$. This finishes the proof of the induction step and hence of the theorem.

E. Proof of Theorem [5] 

Proposition 2 (simple facts): Consider Algorithm [2] 

1) An i e Af t will definitely get detected if \(x t ) t \ > a ad d + -^\\ x t ~ x t . mo dcs\\- 

2) An i e Aft will definitely not be deleted if \(x t )i\ > add + ^§=\\ x t ~ ^t,add||- 

3) All i G A e ,t (the zero elements of 7t) will definitely get deleted if adei > \\ x — ^t,add||oo- 

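Recall from the signal model that $\mathcal{N}_t = \cup_{\tau = t - d_{\min} + 1}^{t} \mathcal{A}_\tau \cup \mathcal{C}_t \cup \mathcal{SD}_t$, noting that the first two sets might not be disjoint. By the induction assumption, $|\tilde{T}_{t-1}| \le S$. Since $T_t = \tilde{T}_{t-1} = \hat{\mathcal{N}}_{t-1}$,

$$|T_t| \le S \qquad (32)$$

Also, by the induction assumption,

$$\tilde{\Delta}_{t-1} \subseteq \mathcal{SD}_{t-1} \cup \mathcal{A}_{t-1} \cup \dots \cup \mathcal{A}_{t-d_0} \qquad (33)$$

Recall that $\mathcal{N}_t = \mathcal{N}_{t-1} \cup \mathcal{A}_t \setminus \mathcal{R}_t$. Also, $\mathcal{SD}_{t-1} \subseteq \mathcal{SD}_t \cup \mathcal{R}_t$. Thus, $\mathcal{SD}_{t-1} \cap \mathcal{R}_t^c \subseteq \mathcal{SD}_t$. Thus,

$$\Delta_t = \mathcal{N}_t \cap \hat{\mathcal{N}}_{t-1}^c = (\mathcal{N}_{t-1} \cap \mathcal{R}_t^c \cap \hat{\mathcal{N}}_{t-1}^c) \cup (\mathcal{A}_t \cap \hat{\mathcal{N}}_{t-1}^c) \subseteq (\tilde{\Delta}_{t-1} \cap \mathcal{R}_t^c) \cup \mathcal{A}_t \subseteq \mathcal{SD}_t \cup \mathcal{A}_{t-1} \cup \dots \cup \mathcal{A}_{t-d_0} \cup \mathcal{A}_t \qquad (34)$$

Thus,

$$|\Delta_t| \le \frac{b+1}{2}S_a + d_0 S_a + S_a \qquad (35)$$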



Using the above bounds on $|T_t|$ and $|\Delta_t|$ and the RIP condition of the theorem, we can apply Lemma 1 to show that

$$\|x_t - \hat{x}_{t,\text{modcs}}\| \le 7.50\epsilon \qquad (36)$$

Thus, using Proposition 2 and condition 3, all elements of $\mathcal{A}_{t-d_0}$ are definitely detected in the add step at $t$, i.e.

$$\mathcal{A}_{t-d_0} \subseteq \hat{\mathcal{A}}_t \qquad (37)$$

Also, since $\ell$ satisfies condition 3, all elements of $\mathcal{C}_t$ will be detected in the add step at $t$.
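Using (37),

$$\Delta_{\text{add},t} = \Delta_t \setminus \hat{\mathcal{A}}_t \subseteq (\mathcal{SD}_t \cup \mathcal{A}_t \cup \mathcal{A}_{t-1} \cup \dots \cup \mathcal{A}_{t-d_0}) \setminus \hat{\mathcal{A}}_t \subseteq \mathcal{SD}_t \cup \mathcal{A}_t \cup \mathcal{A}_{t-1} \cup \dots \cup \mathcal{A}_{t-d_0+1} \qquad (38)$$

Thus,

$$|\Delta_{\text{add},t}| \le \frac{b+1}{2}S_a + d_0 S_a \qquad (39)$$

Also, $T_{\text{add},t} \subseteq \mathcal{N}_t \cup \Delta_{e,\text{add},t}$ and

$$\Delta_{e,\text{add},t} = \Delta_{e,t} \cup (\hat{\mathcal{A}}_t \setminus \mathcal{N}_t) \subseteq \tilde{\Delta}_{e,t-1} \cup \mathcal{R}_t \cup (\hat{\mathcal{A}}_t \setminus \mathcal{N}_t) \qquad (40)$$

Thus, $|\Delta_{e,\text{add},t}| \le S_a + f$ and so

$$|T_{\text{add},t}| \le S + |\Delta_{e,\text{add},t}| \le S + S_a + f \qquad (41)$$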

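By Lemma 3 and condition 2c of the theorem, we have

$$\|x_t - \hat{x}_{t,\text{add}}\| \le 1.12\epsilon + (1 + 1.261\,\theta_{|T_{\text{add},t}|,|\Delta_{\text{add},t}|})\,\|(x_t)_{\Delta_{\text{add},t}}\| \le 1.12\epsilon + 1.261\,\|(x_t)_{\Delta_{\text{add},t}}\| \qquad (42)$$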
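The last inequality in (42) is plain arithmetic once the restricted orthogonality constant is at most 0.207, the RIP level quoted elsewhere in this proof. The check below assumes that bound on $\theta$; the variable names are ours.

```python
theta_max = 0.207                     # assumed upper bound on the constant theta
factor = 1 + 1.261 * theta_max        # multiplier of the missed-element norm in (42)
# factor is about 1.261, so (1 + 1.261 * theta) can be replaced by 1.261 up to rounding
```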
Recall that, by Proposition [2] any element of sa 1(1J t will have magnitude smaller than a add + -4=7. 50e. By 4391 , we have 



Na^.J < W|A addj t|(a add + -^=7.50e) 



< J (^T^ a + d Q S a )(a add + ^7.50e) (43) 



Let ft = v /((fe+l) +do )( 

a add + -^=7.50e). Combining this with the bound on |7^ ddjt | and |A add t | we can bound the LS step 
error by a time-invariant quantity, 

\\(x t ~ x tMd ) %M J < 1.12e+ 1.261/1 (44) 

Using Assumption |2] we have, 

IKzt-^add^JU < 1.12-^=e + 0.261Ci/i (45) 
Using the fact that a de i is equal to the RHS of the above equation and proposition fact 3, if (x t )j = 0, then j E TZ t - Thus, 

K c Q Kt (46) 

Next, using 4T9} , 443} , fact 2 of Proposition [2] and the value of a de i, we can conclude the following: if j G £ t j j w iU not g et 
falsely deleted; the same is true if j <G -4 T , t < t — do. Thus, 

K t C AA t c U S2?t U A U A-i • • • U A-do+i (47) 

Recall that A/" t = Sf t -i LlA t \U t . Thus 

At = Af t \Af t = (K n NU n i?) u (M n n f ) 

C (At n i t c ) U (<SX>t U A U At-i . ..At- da +i) (48) 

Since At-d C it, using 434} , we get 

A f n A c t C 5X> t UAfU A-i • • • U -At-do+i (49) 

Thus, using 448} , 

At CSDtU i(U A-1---U A-do+i (50) 
Thus,

$$|\tilde{\Delta}_t| \le \frac{b+1}{2}S_a + d_0 S_a \qquad (51)$$

Now consider $\tilde{\Delta}_{e,t}$.



= {Aft-! nn c t n W t c ) u (A t nn c t n W t c ) 

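As $\mathcal{N}_t^c \subseteq \hat{\mathcal{R}}_t$, we have $\hat{\mathcal{R}}_t^c \subseteq \mathcal{N}_t$. Thus,

$$\tilde{\Delta}_{e,t} = \emptyset \qquad (52)$$

Thus,

$$|\tilde{\Delta}_{e,t}| = 0 \qquad (53)$$

Since $|\mathcal{N}_t| \le S$ and since $|\tilde{T}_t| \le |\mathcal{N}_t| + |\tilde{\Delta}_{e,t}|$,

$$|\tilde{T}_t| \le S \qquad (54)$$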

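By condition 2,

$$\delta_{|T_t| + 3|\Delta_t|} \le \delta_{S + 3(\frac{b+1}{2}S_a + d_0 S_a + S_a)} \le 0.207 \qquad (55)$$

and

$$\delta_{|\tilde{T}_t|} \le \delta_S \le \delta_{S + S_a + f} \le 0.207$$

Proceeding in the same way as for the bound on $\|x_t - \hat{x}_{t,\text{add}}\|$, we have

$$\|x_t - \hat{x}_t\| \le 1.12\epsilon + 1.261\,\|(x_t)_{\tilde{\Delta}_t}\|$$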
Also, using Proposition |2] any element of x^ t will have magnitude smaller than a<j e i + 1.12-4=e. By ( fBll l, we have 



II^aJI < ^j(^^-S a + d S a )(a <Sel + 1.12^=e) 

Thus, the final claim is proved. 

F. Proof of Remark U 1\ Necessary conditions 

Consider the noise-free case, i.e. $\epsilon = 0$, and Algorithm 1. We claim that left-RIP($S + S_a$) at all times $t > 0$ is necessary to ensure exact recovery of all sparse signal sequences with support size at most $S$, and number of support additions and removals at most $S_a$. We prove this here. Assume exact recovery at $t - 1$. Assume also that the support size at $t - 1$ is $S$, and that there are $S_a$ new additions and $S_a$ new removals at time $t$. Thus the support size at time $t$ is also $S$.

Suppose that left-RIP($S + S_a$) does not hold. This means there is a set, $R$, of size $S + S_a$ for which $\text{rank}((A_t)_R) < S + S_a$. Pick a $z$ so that $z_R \in \text{null}((A_t)_R)$ (i.e. $(A_t)_R z_R = 0$) and $z_{R^c} = 0$. Partition $R$ into three sets $R = D \cup D_1 \cup D_2$ such that all are disjoint, $|D| = S - S_a$, $|D_1| = S_a = |D_2|$ and $\|z_{D_2}\|_1 \le \|z_{D_1}\|_1$. Create two sparse vectors $x^1$ and $x^2$ supported on $D \cup D_1$ and $D \cup D_2$ respectively as follows. Let $(x^1)_D = z_D/2$, $(x^1)_{D_1} = z_{D_1}$, $(x^1)_{(D \cup D_1)^c} = 0$. Let $(x^2)_D = -z_D/2$, $(x^2)_{D_2} = -z_{D_2}$, $(x^2)_{(D \cup D_2)^c} = 0$. Then both $x^1$ and $x^2$ have support size $S$.

Suppose that the signal at time $t$ is $x^1$, i.e. $x_t = x^1$ so that $y_t = A_t x^1$, and suppose that the support (equal to the support estimate) from $t - 1$ is $T = D \cup \Delta_e$ where $\Delta_e$ is a subset of $(D \cup D_1 \cup D_2)^c$ of size $S_a$. Consider the solution of modified-CS with $\epsilon = 0$. In this case, both $x^1$ and $x^2$ are feasible since $A_t(x^1 - x^2) = (A_t)_D z_D/2 + (A_t)_{D_1} z_{D_1} - (A_t)_D(-z_D/2) - (A_t)_{D_2}(-z_{D_2}) = (A_t)_R z_R = 0$. But, $\|(x^1)_{T^c}\|_1 = \|(x^1)_{D_1}\|_1 = \|z_{D_1}\|_1 \ge \|z_{D_2}\|_1 = \|(x^2)_{T^c}\|_1$. Thus, clearly $x^1$ will not be the unique solution to modified-CS with $\epsilon = 0$. This proves that left-RIP($S + S_a$) is necessary.
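A toy instance makes the construction concrete. The sketch below uses $S = 2$, $S_a = 1$, $m = 4$ and a $2 \times 4$ matrix whose first three columns are linearly dependent; the matrix entries, sizes and helper names are illustrative choices and not from the paper.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def l1_outside(x, T):
    """l1 norm of x on the complement of T, i.e. the modified-CS cost."""
    return sum(abs(v) for i, v in enumerate(x) if i not in T)

A = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 1.0, 1.0]]
# R = {0, 1, 2}: columns 0 and 1 sum to column 2, so (A)_R has rank 2 < 3 = S + S_a.
z = [1.0, 1.0, -1.0, 0.0]             # z_R lies in the null space of (A)_R
D, D1, D2 = {0}, {1}, {2}
x1 = [z[0] / 2, z[1], 0.0, 0.0]       # supported on the union of D and D1
x2 = [-z[0] / 2, 0.0, -z[2], 0.0]     # supported on the union of D and D2
T = D | {3}                           # previous support estimate: D plus one extra index
y = matvec(A, x1)
same_measurements = matvec(A, x2) == y
cost1, cost2 = l1_outside(x1, T), l1_outside(x2, T)
```

Both vectors explain the same measurements and $x^2$ has cost no larger than $x^1$ outside $T$, so $x^1$ cannot be the unique minimizer, which is the failure described in the proof.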
G. Generative model for Signal Model 2

This model requires that when a new element $j$ gets added to the support, its magnitude keeps increasing at rate $r_{j,t}$ until it reaches the large set, and that an element $i$ of the large set starts to decrease at rate $r_{i,t}$ until it reaches 0. The sign is selected as +1 or -1 with equal probability when the element gets added to the support, but remains the same after that. We can choose values for $a_{\min}$, $d_{\min}$, $r_{\min}(d_{\min})$, $S_a$, $m$, $b$ during simulation.

Mathematically, it can be described as follows. Let $(x_t)_j = (M_t)_j (s_t)_j$ where $(M_t)_j$ denotes the magnitude and $(s_t)_j$ denotes the sign of $(x_t)_j$ at time $t$. $x_t$ is an $m \times 1$ vector; $S_0 = [\mu_1 S]$, where $\mu_1$ is a random number between 0.9 and 1.

For $1 \le t \le b$, let $S_{a,t} = 0$, $S_{r,t} = 0$, $S_{d,t} = S_a$. For any $t > b$, do the following.

1) Generate

a) the new addition set, $\mathcal{A}_t$, of size $S_{a,t} = [\mu_2(\sum_{\tau=1}^{t-1} S_{d,\tau} - \sum_{\tau=1}^{t-1} S_{a,\tau})]$ (here $\mu_2$ is a random number between 0.9 and 1) uniformly at random from $\mathcal{N}_{t-1}^c$,

b) the new decreasing set, $\mathcal{B}_t$, of size $S_{d,t} = [\mu_3 S_a]$ (here $\mu_3$ is a random number between 0.5 and 1) uniformly at random from $\mathcal{C}_{t-1}$, and

c) the new deleted set, $\mathcal{R}_t$, of size $S_{r,t} = [\mu_4 |\mathcal{SD}_{t-1}|]$ (here $\mu_4$ is a random number between 0.1 and 0.3), as the smallest $S_{r,t}$ elements of $\mathcal{SD}_{t-1}$.

2) Update the coefficients' magnitudes as follows. 



(Af t ), = 






' (M t _ 1 ) i 


+ n,t, 


i e A t -d min U C t -i \ B tl r J:t = 


{M t -i)i 


+ n,t, 


i ^ ^T=t— dmin+l'^ 7 "' ^*ht M6^min (^min 


< [Mt-x)i 




i 6 SV t -i \ TZ tl r itt = /i 7 |; 


(M t -i) t 


- n,u 


i e Bt,r itt = n 6 {M it t-\ - t); 






i e A/?. 



where [17 and fig are random numbers between 1 and 1.44; /15 is a random number larger than — ((M t -i)i — £). 

3) Update the signs as follows.

$$(s_t)_i = \begin{cases} (s_{t-1})_i, & i \in \mathcal{N}_t \setminus \mathcal{A}_t \\ \text{iid}(\pm 1), & i \in \mathcal{A}_t \\ 0, & i \in \mathcal{N}_t^c \end{cases} \qquad (56)$$

where $\text{iid}(\pm 1)$ refers to generating the sign as +1 or -1 with equal probability and doing this independently for each element $i$.

4) Set $(x_t)_i = (M_t)_i (s_t)_i$ for all $i$.

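5) Update

$$\mathcal{C}_t = \mathcal{A}_{t-d_{\min}} \cup \mathcal{C}_{t-1} \setminus \mathcal{B}_t,$$
$$\mathcal{SD}_t = \mathcal{SD}_{t-1} \cup \mathcal{B}_t \setminus \mathcal{R}_t.$$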
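Steps 3 and 5 of the generative model can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' simulation code: the function names and arguments are hypothetical, the magnitude update of step 2 is not modeled, and $X \cup Y \setminus Z$ is read as $(X \cup Y) \setminus Z$, the convention stated in the proofs above.

```python
import random

def update_signs(prev_signs, support, additions, rng):
    """Step 3: keep the old sign on the support outside the new additions,
    draw +1 or -1 with equal probability on the new additions,
    and set the sign to 0 off the support."""
    signs = []
    for i, s_prev in enumerate(prev_signs):
        if i in additions:
            signs.append(rng.choice((-1, 1)))
        elif i in support:
            signs.append(s_prev)
        else:
            signs.append(0)
    return signs

def update_sets(A_delayed, C_prev, B_t, SD_prev, R_t):
    """Step 5: new large set and new small decreasing set.
    A_delayed stands for the addition set from d_min frames ago."""
    C_t = (A_delayed | C_prev) - B_t
    SD_t = (SD_prev | B_t) - R_t
    return C_t, SD_t

rng = random.Random(0)  # fixed seed, an arbitrary choice
signs = update_signs([1, -1, 0, 0, 1], support={0, 1, 3}, additions={3}, rng=rng)
C_t, SD_t = update_sets({5}, {0, 1}, {1}, {4}, {4})
```

In the usage above, index 3 is a new addition and receives a fresh random sign, index 4 has left the support so its sign becomes 0, and element 1 moves from the large set into the small decreasing set.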